I. INTRODUCTION

More than eighty years after its formulation, quantum 
theory is still mysterious. The theory has a solid mathematical
foundation, addressed by Hilbert, von Neumann
and Nordheim in 1928 [1] and brought to completion in
the monumental work by von Neumann [2]. However,
this formulation is based on the abstract framework of 
Hilbert spaces and self-adjoint operators, which, to say 
the least, are far from having an intuitive physical mean- 
ing. For example, the postulate stating that the pure 
states of a physical system are represented by unit vectors 
in a suitable Hilbert space appears rather artificial:
what are the physical laws that lead to this very specific
choice of mathematical representation? The problem
with the standard textbook formulations of quantum
theory is that the postulates therein impose particular 
mathematical structures without providing any funda- 
mental reason for this choice: the mathematics of Hilbert 
spaces is adopted without further questioning as a pre- 
scription that "works well" when used as a black box 
to produce experimental predictions. In a satisfactory 
axiomatization of Quantum Theory, instead, the mathematical
structures of Hilbert spaces (or C*-algebras)
should emerge as consequences of physically meaningful 
postulates, that is, postulates formulated exclusively in 
the language of physics: this language refers to notions 
like physical system, experiment, or physical process and 
not to notions like Hilbert space, self-adjoint operator, or 
unitary operator. Note that any serious axiomatization 
has to be based on postulates that can be precisely trans- 
lated in mathematical terms. However, the point with 
the present status of quantum theory is that there are 
postulates that have a precise mathematical statement, 
but cannot be translated back into the language of physics.
Those are the postulates that one would like to avoid. 

The need for a deeper understanding of quantum theory
in terms of fundamental principles has been clear from
the very beginning. Von Neumann himself expressed
his dissatisfaction with his mathematical formulation of
Quantum Theory with the surprising words "I don't believe
in Hilbert space anymore", reported by Birkhoff
in [3]. Realizing the physical relevance of the axiomatization
problem, Birkhoff and von Neumann made an
attempt to understand quantum theory as a new form
of logic [4]: the key idea was that propositions about
the physical world must be treated in a suitable logical
framework, different from classical logic, where the
operations AND and OR are no longer distributive. This
work inaugurated the tradition of quantum logics, which
led to several attempts to axiomatize quantum theory,
notably by Mackey [5] and Jauch and Piron [6] (see Ref. [7]
for a review of the more recent progress in quantum
logics). In general, a certain degree of technicality,
mainly related to the emphasis on infinite-dimensional
systems, makes these results far from providing a clear-cut
description of quantum theory in terms of fundamental
principles. Later Ludwig initiated an axiomatization
program [8] adopting an operational approach, where the
basic notions are those of preparation devices and measuring
devices and the postulates specify how preparations
and measurements combine to give the probabilities
of experimental outcomes. However, despite the original
intent, Ludwig's axiomatization did not succeed in deriving
Hilbert spaces from purely operational notions, as
some of the postulates still contained mathematical notions
with no operational interpretation.

More recently, the rise of quantum information science 
moved the emphasis from logics to information process- 
ing. The new field clearly showed that the mathematical 
principles of quantum theory imply an enormous amount 
of information-theoretic consequences, such as the no-cloning
theorem [9, 10], the possibility of teleportation
[11], secure key distribution [12]-[14], or of factoring numbers
in polynomial time [15]. The natural question is
whether the implication can be reversed: is it possible 
to retrieve quantum theory from a set of purely infor- 
mational principles? Another contribution of quantum 
information has been to shift the emphasis to finite di- 
mensional systems, which allow for a simpler treatment 
but still possess all the remarkable quantum features. In 
a sense, the study of finite dimensional systems allows 
one to decouple the conceptual difficulties in our under- 
standing of quantum theory from the technical difficulties 
of infinite dimensional systems. 

In this scenario, Hardy's 2001 work [16] re-opened
the debate about the axiomatizations of quantum theory 
with fresh ideas. Hardy's proposal was based on five main 
assumptions about the relation between dimension of the 
state space and the number of perfectly distinguishable 
states of a given system, about the structure of compos- 
ite systems, and about the possibility of connecting any 
two pure states of a physical system through a contin- 
uous path of reversible transformations. However, some 
of these assumptions directly refer to the mathematical 
properties of the state space (in particular, the "Sim- 
plicity Axiom" 2, which is an abstract statement about 
the functional dependence of the state space dimension 
on the number of perfectly distinguishable states). Very 
recently, building on Hardy's work, there have been two
new attempts at axiomatization by Dakic and Brukner
[17] and Masanes and Müller [18]. Although these works
succeeded in removing the "Simplicity Axiom", they still
contain mathematical assumptions that cannot be understood
in elementary physical terms (see e.g. requirement
5 of Ref. [18], which assumes that "all mathematically
well-defined measurements are allowed by the theory").

Another approach to the axiomatization of quantum 
theory was pursued by one of the authors in a series of
works [19] culminating in Ref. [20]. These works tackled
the problem using operational principles related to
tomography and calibration of physical devices, experimental
complexity, and to the composition of elementary
transformations. In particular, this research introduced
the concept of dynamically faithful states, namely states
that can be used for the complete tomography of physical
processes. Although this approach came very close
to deriving quantum theory, in this case too one mathematical
assumption without operational interpretation
was needed (see the CJ postulate of Ref. [20]).

In this paper we provide a complete derivation of finite
dimensional quantum theory based on purely operational
principles. Our principles do not refer to abstract prop- 
erties of the mathematical structures that we use to rep- 
resent states, transformations or measurements, but only 
to the way in which states, transformations and measure- 
ments combine with each other. More specifically, our 
principles are of informational nature: they assert basic 
properties of information-processing, such as the possibil- 
ity or impossibility to carry out certain tasks by manip- 
ulating physical systems. In this approach the rules by 
which information can be processed determine the physical
theory, in accordance with Wheeler's program "it
from bit", for which he argued that "all things physical
are information-theoretic in origin" [22]. Note, however,
that our axiomatization of quantum theory is relevant,
as a rigorous result, also for those who do not share
Wheeler's ideas on the informational origin of physics. In
particular, in the process of deriving quantum theory we 
provide alternative proofs for many key features of the 
Hilbert space formalism, such as the spectral decomposition
of self-adjoint operators or the existence of projections.
The interesting feature of these proofs is that they
are obtained by manipulation of the principles, without
assuming Hilbert spaces from the start.

The main message of our work is simple: within a stan- 
dard class of theories of information processing, quantum 
theory is uniquely identified by a single postulate: purification.
The purification postulate, introduced in Ref.
[21], expresses a distinctive feature of quantum theory,
namely that the ignorance about a part is always compatible
with the maximal knowledge of the whole. The
key role of this feature was noticed already in 1935 by
Schrödinger in his discussion about entanglement [23],
of which he famously wrote "I would not call that one
but rather the characteristic trait of quantum mechanics,
the one that enforces its entire departure from classical
lines of thought". In a sense, our work can be
viewed as the concrete realization of Schrödinger's claim:
the fact that every physical state can be viewed as the 
marginal of some pure state of a compound system is 
indeed the key to single out quantum theory within a 
standard set of possible theories. It is worth stressing, 
however, that the purification principle assumed in this 
paper includes a requirement that was not explicitly mentioned
in Schrödinger's discussion: if two pure states of a
composite system AB have the same marginal on system
A, then they are connected by some reversible transformation
on system B. In other words, we assume that all
purifications of a given mixed state are equivalent under
local reversible operations [24].
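
In the quantum case both parts of this requirement can be checked on a small instance. The following is a minimal sketch, assuming a qubit in the diagonal state diag(p, 1-p) and using plain Python lists for matrices; the helper names (`outer`, `marginal_A`, `apply_on_B`, `purity`) are illustrative and not taken from the paper. It builds one purification, obtains a second one by acting with a unitary on system B only, and prints the purity of each joint state together with its marginal on A.

```python
import math

def outer(psi):
    """Density matrix |psi><psi| of a state vector (list of complex numbers)."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def marginal_A(rho_AB):
    """Partial trace over B of a two-qubit density matrix (index = 2*a + b)."""
    return [[sum(rho_AB[2 * a + b][2 * a2 + b] for b in range(2))
             for a2 in range(2)] for a in range(2)]

def apply_on_B(U, psi):
    """Apply the 2x2 matrix U to the second qubit of a two-qubit vector."""
    out = [0j] * 4
    for a in range(2):
        for b in range(2):
            for b2 in range(2):
                out[2 * a + b] += U[b][b2] * psi[2 * a + b2]
    return out

def purity(rho):
    """Tr[rho^2]; equal to 1 for a normalized pure state."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real for i in range(n) for j in range(n))

p = 0.3
rho_A = [[p, 0.0], [0.0, 1 - p]]          # mixed state of system A (assumed example)

# One purification: sqrt(p)|00> + sqrt(1-p)|11>
psi = [complex(math.sqrt(p)), 0j, 0j, complex(math.sqrt(1 - p))]

# A second purification, obtained by a reversible (unitary) map on B only.
s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]
psi2 = apply_on_B(hadamard, psi)

for vec in (psi, psi2):
    joint = outer(vec)
    print(round(purity(joint), 6),
          [[round(x.real, 6) for x in row] for row in marginal_A(joint)])
```

Both joint states are pure and share the marginal diag(p, 1-p) on A, and they differ by a reversible transformation acting on B alone, which is the situation described by the requirement above.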

The purification principle expresses a law of conser- 
vation of information, stating that at least in principle, 
irreversibility can always be reduced to the lack of con- 
trol over an environment. More precisely, the purification 
principle is equivalent to the statement that every irre- 
versible process can be simulated in an essentially unique 
way by a reversible interaction of the system with an environment,
which is initially in a pure state [21]. This
statement can also be extended to include the case of
measurement processes, and in that case it implies the
possibility of arbitrarily shifting the cut between the observer
and the observed system. The possibility of
such a shift was considered by von Neumann as a "fundamental
requirement of the scientific viewpoint" (see p.
418 of [2]) and his discussion of the measurement process
was aimed precisely at showing that quantum theory fulfils
this requirement.

Besides Schrödinger's discussion on entanglement and
von Neumann's discussion of the measurement process,
the purification principle is deeply rooted in the structure
of quantum theory. At the purely mathematical level, it
plays a crucial role in the theory of C*-algebras of operators
on separable Hilbert spaces, where the purification
principle is equivalent to the Gelfand-Naimark-Segal
(GNS) construction [25] and implies the celebrated Stinespring
theorem [26]. On the other hand, purification is
a cornerstone of quantum information, lying at the origin
of most quantum protocols. As was shown in Ref.
[21], the purification principle directly implies crucial features
like no-cloning, teleportation, no-information without
disturbance, error correction, the impossibility of bit
commitment, and the "no-programming" theorem of Ref. [27].
In addition to the purification postulate, our derivation 
of quantum theory is based on five informational axioms. 
The reason why we call them "axioms", as opposed to
the purification "postulate", is that they are not at
all specific to quantum theory. These axioms represent
standard features of information-processing that every- 
one would, more or less implicitly, assume. They define a 
class of theories of information-processing that includes, 
for example, classical information theory, quantum infor- 
mation theory, and quantum theory with superselection 
rules. The question whether there are other theories sat- 
isfying our five axioms and, in case of a positive answer, 
the full classification of these theories is currently an open 
problem. 

Here we informally illustrate the five axioms, leaving 
the more detailed description to the remaining part of 
the paper: 

1. Causality: the probability of a measurement out- 
come at a certain time does not depend on the 
choice of measurements that will be performed 
later. 

2. Perfect distinguishability: if a state is not com- 
pletely mixed (i.e. if it cannot be obtained as a 
mixture from any other state), then there exists at 
least one state that can be perfectly distinguished 
from it. 

3. Ideal compression: every source of information can 
be encoded in a suitable physical system in a loss- 
less and maximally efficient fashion. Here lossless 
means that the information can be decoded with- 
out errors and maximally efficient means that every 
state of the encoding system represents a state in
the information source. 

4. Local distinguishability: if two states of a compos- 
ite system are different, then we can distinguish 
between them from the statistics of local measure- 
ments on the component systems. 

5. Pure conditioning: if a pure state of system AB 
undergoes an atomic measurement on system A, 
then each outcome of the measurement induces a 
pure state on system B. (Here atomic measurement 
means a measurement that cannot be obtained as 
a coarse-graining of another measurement). 

All these axioms are satisfied by classical information the- 
ory. Axiom 5 is even trivial for classical theory, because 
the only pure states of a composite system AB are the 
product of pure states of the component systems A and 
B, and hence the state of system B will be pure irrespective
of what we do on system A.

A stronger version of axiom 5, introduced in Ref. [20],
is the following:

5' Atomicity of composition: the sequential compo- 
sition of two atomic operations is atomic. (Here 
atomic transformation means a transformation 
that cannot be obtained from coarse-graining). 

However, it turns out that Axiom 5 is enough for our
derivation: thanks to the purification postulate we will
be able to show the non-trivial implication: Axiom 5 =>
Axiom 5' (see lemma 16).

The paper is organized as follows. In Sec. II we review
the framework of operational-probabilistic theories introduced
in Ref. [21]. This framework will provide the basic
notions needed for the formulation of our principles. In
Sec. III we introduce the principles from which we will
derive Quantum Theory. In Sec. IV we prove some direct
consequences of the principles that will be used later in
the paper. In Sec. V we discuss the properties of perfectly
distinguishable states, while in Sec. VI we prove
the existence of a duality between pure states and atomic
effects.

The results about distinguishability and duality of pure
states and atomic effects allow us to show in Sec. VII that
every system has a well defined informational dimension,
the operational counterpart of the Hilbert space
dimension. Sec. VIII contains the proof that every state
can be decomposed as a convex combination of perfectly
distinguishable pure states. Similarly, any element of the
vector space spanned by the states can be written as a linear
combination of perfectly distinguishable states. This
result corresponds to the spectral theorem for self-adjoint
operators on complex Hilbert spaces. In Sec. IX we prove
some results about the maximum teleportation probability,
which allow us to derive a functional relation between
the dimension of the state space and the number of perfectly
distinguishable states of the system. The mathematical
representation of systems with two perfectly distinguishable
states is derived in Sec. X, where we prove
that such systems are indeed two-dimensional quantum
systems, a.k.a. qubits. In Sec. XI we construct projections
on the faces of the state space of any system and
prove their main properties. These results lead to the
derivation of the operational analogue of the superposition
principle in Sec. XII, which allows us to prove that systems
with the same number of perfectly distinguishable
states are operationally equivalent (Subsec. XII B). The
properties of the projections and the superposition principle
are then exploited in Sec. XIII, where we extend
the density matrix representation from qubits to higher-dimensional
systems, thus proving that a system with d
perfectly distinguishable states is indeed a quantum system
with d-dimensional Hilbert space. We conclude the
paper with Sec. XIV, where we review our results, discussing
future directions for this research.



II. THE FRAMEWORK 

This Section provides a brief summary of the framework
of operational-probabilistic theories, which was formulated
in Ref. [21]. We refer to Ref. [21] for an exhaustive
presentation of the details of the framework and of
the ideas behind it. The operational-probabilistic frame- 
work combines the operational language of circuits with 
the toolbox of probability theory: on the one hand, experiments
are described by circuits resulting from the
connection of physical devices, on the other hand each 
device in the circuit can have classical outcomes and the 
theory provides the probability distribution of outcomes 
when the devices are connected to form closed circuits 
(that is, circuits that start with a preparation and end 
with a measurement). 

The notions discussed in this section will allow us to 
draw a precise distinction between principles with an op- 
erational content and exclusively mathematical princi- 
ples: with the expression "operational principle" we will 
mean a principle that can be expressed using only the 
basic notions of the operational-probabilistic framework.



A. Circuits with outcomes 

A test represents one use of a physical device, like 
a Stern-Gerlach magnet, a beamsplitter, or a photon
counter. The device will have an input system and an
output system, labelled by capital letters. The corresponding
test can have different classical outcomes, represented
by different values of an index $i \in X$:

[Circuit diagram: a box for the test $\{\mathcal{C}_i\}_{i \in X}$ with input wire A and output wire B]

Each outcome $i \in X$ corresponds to a possible event,
represented as

[Circuit diagram: a box for the event $\mathcal{C}_i$ with input wire A and output wire B]

We denote by Transf(A, B) the set of all events from A to
B. The reason for this notation is that in the next subsection
the elements of Transf(A, B) will be interpreted
as transformations with input system A and output system
B. If A = B we simply write Transf(A) in place of
Transf(A, A).

A test with a single outcome will be called determin- 
istic. This name is justified by the fact that, if there is 
a single possible outcome, then this outcome will occur 
with certainty (cf. the probabilistic structure introduced 
in the next subsection). 

Two devices can be composed in a sequence, as long
as the input system of the second device is equal to the
output system of the first. The events in the composite
test are represented as

[Circuit diagram: event $\mathcal{C}_i$ from A to B followed by event $\mathcal{D}_j$ from B to C]

and are written in formulas as $\mathcal{D}_j \mathcal{C}_i$.

For every system A one can perform the identity-test
(or simply, the identity), that is, a test $\{\mathcal{I}_A\}$ with a single
outcome, with the property

$\mathcal{C}_i \mathcal{I}_A = \mathcal{C}_i \quad \forall\, \mathcal{C}_i \in \mathrm{Transf}(A, B)$,
$\mathcal{I}_A \mathcal{D}_j = \mathcal{D}_j \quad \forall\, \mathcal{D}_j \in \mathrm{Transf}(B, A)$.

The subindex A will be dropped from $\mathcal{I}_A$ where there is
no ambiguity.

The letter I will be reserved for the trivial system,
which simply means "nothing" [28]. A device with input
(resp. output) system I is a device with no input
(resp. no output). The corresponding tests will be called
preparation-tests (resp. observation-tests). In this case
we replace the input (resp. output) wire with a round
portion:

[Circuit diagram: preparation $\rho_i$ with output wire B (resp. observation $a_j$ with input wire A)]   (1)

In formulas we will write $|\rho_i)_B$ (resp. $(a_j|_A$). The sets
Transf(I, A) and Transf(A, I) will be denoted as St(A) and
Eff(A), respectively. The reason for this special notation
is that in the next subsection the elements of St(A) (resp.
Eff(A)) will be interpreted as the states (resp. effects) of
system A.

From every pair of systems A and B one can form a
composite system, denoted by AB. Clearly, composing
system A with nothing still gives system A, in formula
AI = IA = A. Two devices can be composed in parallel,
thus obtaining a new device with composite input and
composite output systems. The events in the composite test
are represented as

[Circuit diagram: event $\mathcal{C}_i$ from A to B drawn in parallel with event $\mathcal{D}_j$ from C to D]

and are written in formulas as $\mathcal{C}_i \otimes \mathcal{D}_j$. In the special
case of states we will often write $|\rho_i)|\sigma_j)$ in place of $\rho_i \otimes \sigma_j$.
Similarly, for effects we will write $(a_i|(b_j|$ in place of
$a_i \otimes b_j$.

Sequential and parallel composition commute: one
has $(\mathcal{A}_i \otimes \mathcal{B}_j)(\mathcal{C}_k \otimes \mathcal{D}_l) = \mathcal{A}_i \mathcal{C}_k \otimes \mathcal{B}_j \mathcal{D}_l$ for every
$\mathcal{A}_i, \mathcal{B}_j, \mathcal{C}_k, \mathcal{D}_l$ such that the output of $\mathcal{C}_k$ (resp. $\mathcal{D}_l$)
coincides with the input of $\mathcal{A}_i$ (resp. $\mathcal{B}_j$).

When one of the two tests is the identity, we will omit
the box and draw only a straight line, as in

[Circuit diagram: a composite circuit in which the identity is drawn as a bare wire]

The rules summarized in this section define the operational
language of circuits, which has been discussed
in detail in a series of inspiring works by Coecke (see in
particular Refs. [29, 30]). The language of circuits allows
one to represent the schematic of an experiment, like e.g.

[Circuit diagram: a preparation-test on AC, a test from A to B acting on the first wire, and an observation-test on BC]

and also to represent a particular outcome of the experiment

[Circuit diagram: the same circuit with the events $\rho_i$, $\mathcal{A}_j$, $B_k$ in place of the tests]

In formula, the above circuit is given by
$(B_k|_{BC} (\mathcal{A}_j \otimes \mathcal{I}_C) |\rho_i)_{AC}$.



B. Probabilistic structure: states, effects and
transformations

On top of the language of circuits, we put a probabilistic
structure [21]: we declare that the composition
of a preparation-test $\{\rho_i\}_{i \in X}$ with an observation-test
$\{a_j\}_{j \in Y}$ gives rise to a joint probability distribution:

[Circuit diagram: preparation $\rho_i$ connected through wire A to observation $a_j$] $= p(i,j)$   (2)

with $p(i,j) \ge 0$ and $\sum_{i \in X} \sum_{j \in Y} p(i,j) = 1$. In formula
we write $p(i,j) = (a_j|\rho_i)$. Moreover, if two experiments
are run in parallel, we assume that the joint probability
distribution is given by the product:

[Circuit diagram: $\rho_i$ connected to $a_k$, in parallel with $\sigma_j$ connected to $b_l$] $= p(i,k)\, q(j,l)$   (3)

where $p(i,k) := (a_k|\rho_i)$, $q(j,l) := (b_l|\sigma_j)$.
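
Both the commutation of sequential and parallel composition and the product rule of Eq. (3) can be illustrated in a matrix representation. The sketch below is only an illustration under stated assumptions: events are represented by small matrices, parallel composition by the Kronecker product, and probabilities by the quantum rule Tr[P rho]; the helper names `matmul`, `kron` and `trace` are hypothetical and written in plain Python to keep the example self-contained.

```python
def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_x for y in row_y]
            for row_x in X for row_y in Y]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

# Sequential and parallel composition commute:
# (A (x) B)(C (x) D) = AC (x) BD, where C and D act first.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]
lhs = matmul(kron(A, B), kron(C, D))
rhs = kron(matmul(A, C), matmul(B, D))
print(lhs == rhs)

# Product rule of Eq. (3) for two independent qubit experiments.
rho = [[0.75, 0.0], [0.0, 0.25]]      # state of the first system
sigma = [[0.5, 0.5], [0.5, 0.5]]      # state of the second system
P = [[1.0, 0.0], [0.0, 0.0]]          # effect on the first system
Q = [[0.5, 0.5], [0.5, 0.5]]          # effect on the second system
joint = trace(matmul(kron(P, Q), kron(rho, sigma)))
product = trace(matmul(P, rho)) * trace(matmul(Q, sigma))
print(joint, product)
```

The first check is the mixed-product property of the Kronecker product; the second shows the joint probability of two parallel experiments factorizing into the product of the individual probabilities.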

The probabilistic structure defined by Eq. (2) turns
every event $\rho_i \in \mathrm{St}(A)$ into a function $\hat{\rho}_i : \mathrm{Eff}(A) \to \mathbb{R}$,
given by $\hat{\rho}_i(a_j) := (a_j|\rho_i)$. If two events $\rho_i, \rho'_i \in \mathrm{St}(A)$
induce the same function, then it is impossible to distinguish
between them from the statistics of the experiments
allowed by our theory. This means that for our
purposes the two events are the same: accordingly, we
will take equivalence classes with respect to the relation
$\rho_i \sim \rho'_i$ if $\hat{\rho}_i = \hat{\rho}'_i$. To avoid introducing new notation,
from now on we will assume that the equivalence classes
have been taken from the start. We will identify the event
$\rho_i \in \mathrm{St}(A)$ with the corresponding function $\hat{\rho}_i$ and will
call it a state. Accordingly, we will refer to preparation-tests
as collections of states $\{\rho_i\}_{i \in X}$. Note that, since
one can take linear combinations of functions, the states
in St(A) generate a real vector space, denoted by $\mathrm{St}_{\mathbb{R}}(A)$.

The same construction holds for observation-tests: every
event $a_j \in \mathrm{Eff}(A)$ induces a function $\hat{a}_j : \mathrm{St}(A) \to \mathbb{R}$,
given by $\hat{a}_j(\rho_i) := (a_j|\rho_i)$. If two events $a_j, a'_j \in \mathrm{Eff}(A)$
induce the same function, then it is impossible to distinguish
between them from the statistics of the experiments
allowed in our theory. This means that for our
purposes the two events are the same: accordingly, we
will take equivalence classes with respect to the relation
$a_j \sim a'_j$ if $\hat{a}_j = \hat{a}'_j$. To avoid introducing new notation,
from now on we will identify the event $a_j \in \mathrm{Eff}(A)$ with
the corresponding function $\hat{a}_j$ and we will call it an effect.
Accordingly, we will refer to observation-tests as collections
of effects $\{a_j\}_{j \in Y}$. The effects in Eff(A) generate a
real vector space, denoted by $\mathrm{Eff}_{\mathbb{R}}(A)$.

A vector in $\mathrm{St}_{\mathbb{R}}(A)$ (resp. $\mathrm{Eff}_{\mathbb{R}}(A)$) can be extended
to a linear function on $\mathrm{Eff}_{\mathbb{R}}(A)$ (resp. $\mathrm{St}_{\mathbb{R}}(A)$). In this
way, states and effects can be thought of as elements of two
real vector spaces, one dual to the other. In this paper
we will restrict our attention to finite dimensional vector
spaces: operationally, this means that the state of a given
physical system is completely determined by the statistics
of a finite number of finite-outcome measurements.
The dimension of the vector space $\mathrm{St}_{\mathbb{R}}(A)$, which by construction
is equal to the dimension of its dual $\mathrm{Eff}_{\mathbb{R}}(A)$,
will be denoted by $D_A$. We will refer to $D_A$ as the size
of system A.
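
As a worked instance in quantum theory (an illustration, not part of the framework itself): a qubit state $\rho$ is fixed by the four real numbers Tr[$\rho$], Tr[$\rho X$], Tr[$\rho Y$], Tr[$\rho Z$], so the real vector space spanned by the states has dimension $D_A = 4$. The sketch below checks that four particular states are mapped to linearly independent vectors. The Pauli matrices, the choice of states and the helper names (`expval`, `to_vector`, `det`) are assumptions of the example.

```python
# Pauli matrices and the identity, as nested lists.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def expval(rho, obs):
    """Tr[rho obs], which is real for Hermitian rho and obs."""
    return sum(rho[i][k] * obs[k][i] for i in range(2) for k in range(2)).real

def to_vector(rho):
    """Represent a qubit state by the statistics (Tr rho, Tr rho X, Tr rho Y, Tr rho Z)."""
    return [expval(rho, obs) for obs in (I2, X, Y, Z)]

def det(M):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

states = [
    [[1, 0], [0, 0]],                 # |0><0|
    [[0, 0], [0, 1]],                 # |1><1|
    [[0.5, 0.5], [0.5, 0.5]],         # |+><+|
    [[0.5, -0.5j], [0.5j, 0.5]],      # |+i><+i|
]
vectors = [to_vector(rho) for rho in states]
print(vectors)
# A nonzero determinant means the four vectors are linearly independent.
print(det(vectors) != 0)
```

Since a Hermitian 2 x 2 matrix has exactly four real parameters, no fifth independent state exists, which is consistent with the qubit having size 4.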

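Finally, the vector spaces $\mathrm{St}_{\mathbb{R}}(A)$ and $\mathrm{Eff}_{\mathbb{R}}(A)$ can be
equipped with suitable norms, which have an operational
meaning related to optimal discrimination schemes.
The norm of an element $\delta \in \mathrm{St}_{\mathbb{R}}(A)$ is given by [21]

$\|\delta\| = \sup_{a_0 \in \mathrm{Eff}(A)} (a_0|\delta) - \inf_{a_1 \in \mathrm{Eff}(A)} (a_1|\delta)$,

while the norm of an element $\xi \in \mathrm{Eff}_{\mathbb{R}}(A)$ is given by

$\|\xi\| = \sup_{\rho \in \mathrm{St}(A)} |(\xi|\rho)|$.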
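
In quantum theory, where effects are matrices E with 0 <= E <= I, the supremum in the first formula is attained by the projector on the positive part of $\delta$ and the infimum by the projector on its negative part, so that $\|\delta\|$ is the sum of the absolute values of the eigenvalues of $\delta$ (the trace norm). The following sketch evaluates this for a 2 x 2 Hermitian $\delta$ using the closed-form eigenvalues; the function name and the example states are assumptions of the illustration.

```python
import math

def operational_norm_qubit(delta):
    """sup_E Tr[E delta] - inf_E Tr[E delta] over 0 <= E <= I,
    for a 2x2 Hermitian matrix delta given as nested lists."""
    a, d = delta[0][0].real, delta[1][1].real
    b = delta[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    eigenvalues = (mean + radius, mean - radius)
    sup_part = sum(e for e in eigenvalues if e > 0)   # projector on the positive part
    inf_part = sum(e for e in eigenvalues if e < 0)   # projector on the negative part
    return sup_part - inf_part

rho0 = [[1.0, 0.0], [0.0, 0.0]]          # |0><0|
rho1 = [[0.5, 0.5], [0.5, 0.5]]          # |+><+|
delta = [[rho0[i][j] - rho1[i][j] for j in range(2)] for i in range(2)]

print(operational_norm_qubit(delta))     # norm of the difference of the two states
print(operational_norm_qubit(rho0))      # norm of a single normalized state
```

For the difference of two states the norm quantifies how well they can be discriminated: it vanishes for identical states and reaches its maximum value 2 for perfectly distinguishable ones.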

We will always take the set of states St(A) to be closed 
in the operational norm. The convenience of this choice is 
the convenience of using real numbers instead of rational 
ones: dealing with a single real number is much easier 
than dealing with a Cauchy sequence of rational numbers. 
Operationally, taking St(A) to be closed is very natural: 
the fact that there is a sequence of states $\{\rho_n\}_{n=1}^{\infty}$ that
converges to $\rho \in \mathrm{St}_{\mathbb{R}}(A)$ means that there is a procedure
to prepare $\rho$ with arbitrary precision and hence that $\rho$
deserves the name of "state".



We conclude this Subsection by noting that every event
$\mathcal{C}_k$ from A to B induces a linear map $\hat{\mathcal{C}}_k$ from $\mathrm{St}_{\mathbb{R}}(A)$ to
$\mathrm{St}_{\mathbb{R}}(B)$, uniquely defined by

$\hat{\mathcal{C}}_k : |\rho) \in \mathrm{St}(A) \mapsto \mathcal{C}_k |\rho) \in \mathrm{St}(B)$.

Likewise, for every system C the event $\mathcal{C}_k \otimes \mathcal{I}_C$ induces
a linear map $\widehat{\mathcal{C}_k \otimes \mathcal{I}_C}$ from $\mathrm{St}_{\mathbb{R}}(AC)$ to $\mathrm{St}_{\mathbb{R}}(BC)$. If two
events $\mathcal{C}_k$ and $\mathcal{C}'_k$ induce the same maps for every possible
system C, then there is no experiment in the theory
that is able to distinguish between them. This means
that for our purposes the two events are the same: accordingly,
we will take equivalence classes with respect
to the relation $\mathcal{C}_k \sim \mathcal{C}'_k$ if $\widehat{\mathcal{C}_k \otimes \mathcal{I}_C} = \widehat{\mathcal{C}'_k \otimes \mathcal{I}_C}$ for every
system C. In this case, we will say that the two events represent
the same transformation. Accordingly, we will refer
to tests $\{\mathcal{C}_i\}_{i \in X}$ as collections of transformations. The
deterministic transformations (corresponding to single-outcome
tests) will be called channels.

C. Basic definitions in the operational-probabilistic 
framework 

Here we summarize a few elementary definitions that will
be used later in the paper. The meaning of the definitions 
in the case of quantum theory is also discussed. 

1. Coarse-graining, refinement, atomic transformations, 
pure, mixed and completely mixed states 

First, we start from the notions of coarse-graining and
refinement. Coarse-graining arises when we join together
some outcomes of a test: we say that the test $\{\mathcal{D}_j\}_{j \in Y}$ is
a coarse-graining of the test $\{\mathcal{C}_i\}_{i \in X}$ if there is a disjoint
partition $\{X_j\}_{j \in Y}$ of X such that

$\mathcal{D}_j = \sum_{i \in X_j} \mathcal{C}_i$.

Conversely, if $\{\mathcal{D}_j\}_{j \in Y}$ is a coarse-graining of $\{\mathcal{C}_i\}_{i \in X}$,
we say that $\{\mathcal{C}_i\}_{i \in X}$ is a refinement of $\{\mathcal{D}_j\}_{j \in Y}$. Intuitively,
a test that refines another is a test that extracts
information in a more precise way: it is a test with better
"resolving power".

The notion of refinement also applies to a single transformation:
a refinement of the transformation $\mathcal{C}$ is given
by a test $\{\mathcal{D}_i\}_{i \in X}$ and a subset $X_0 \subseteq X$ such that

$\mathcal{C} = \sum_{i \in X_0} \mathcal{D}_i$.

Accordingly, we say that each transformation $\mathcal{D}_i$, $i \in X_0$,
is a refinement of $\mathcal{C}$. A transformation $\mathcal{C}$ is atomic if it
has only trivial refinements: if $\mathcal{D}_i$ refines $\mathcal{C}$, then $\mathcal{D}_i = p\,\mathcal{C}$
for some probability $p \ge 0$. A test that consists of atomic
transformations is a test whose "resolving power" cannot
be further improved.
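
Coarse-graining can be sketched for a quantum observation-test, anticipating the quantum examples given later in this Section: merging outcomes amounts to summing the corresponding matrices over each block of the partition. The data layout (dictionaries keyed by outcome) and the function name `coarse_grain` are assumptions of this illustration.

```python
def coarse_grain(test, partition):
    """Sum the events of `test` over each block of `partition`.

    test:      dict mapping outcome -> matrix (nested lists)
    partition: dict mapping new outcome -> list of old outcomes;
               the blocks must be disjoint and cover all old outcomes
    """
    old = [i for block in partition.values() for i in block]
    assert sorted(old) == sorted(test), "partition must be disjoint and cover X"
    n = len(next(iter(test.values())))
    return {
        j: [[sum(test[i][r][c] for i in block) for c in range(n)] for r in range(n)]
        for j, block in partition.items()
    }

# Four-outcome qubit test: projectors on |0>, |1>, |+>, |->, each with weight 1/2.
fine = {
    "0": [[0.5, 0.0], [0.0, 0.0]],
    "1": [[0.0, 0.0], [0.0, 0.5]],
    "+": [[0.25, 0.25], [0.25, 0.25]],
    "-": [[0.25, -0.25], [-0.25, 0.25]],
}
# Forget the outcome within each basis and keep only which basis was used.
coarse = coarse_grain(fine, {"Z basis": ["0", "1"], "X basis": ["+", "-"]})
print(coarse)
```

Each element of the fine-grained test is proportional to a rank-one projector and cannot be refined further, whereas both elements of the coarse-grained test are proportional to the identity and admit the non-trivial refinements listed in `fine`.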

When discussing states (i.e. transformations with trivial
input) we will use the word pure as a synonym of
atomic. A pure state describes a situation of maximal 
knowledge about the system's preparation, a knowledge 
that cannot be further refined. 

As usual, a state that is not pure will be called mixed. 
An important notion is that of completely mixed state: 

Definition 1 (Completely mixed state) A state is
completely mixed if any other state can refine it: precisely,
$\omega \in \mathrm{St}(A)$ is completely mixed if for every $\rho \in \mathrm{St}(A)$
there is a non-zero probability $p > 0$ such that $p\rho$
is a refinement of $\omega$.

Intuitively, a completely mixed state describes a situation 
of complete ignorance about the system's preparation: if 
a system is described by a completely mixed state, then it 
means that we know so little about its preparation that, 
in fact, every preparation is possible. 

We conclude this paragraph with a couple of definitions 
that will be used throughout the paper: 

Definition 2 (Reversible transformation) A transformation
$\mathcal{U} \in \mathrm{Transf}(A, B)$ is reversible if there exists
another transformation $\mathcal{U}^{-1} \in \mathrm{Transf}(B, A)$ such that
$\mathcal{U}^{-1}\mathcal{U} = \mathcal{I}_A$ and $\mathcal{U}\mathcal{U}^{-1} = \mathcal{I}_B$. When A = B the reversible
transformations form a group, indicated as $G_A$.

Definition 3 (Operationally equivalent systems)
Two systems A and B are operationally equivalent if
there exists a reversible transformation from A to B.

When two systems are operationally equivalent one can 
convert one into the other in a reversible fashion. 



2. Examples in quantum theory

Consider a quantum system with Hilbert space $\mathcal{H} = \mathbb{C}^d$,
$d < \infty$. In this case a preparation-test is a collection
of unnormalized density matrices $\{\rho_i\}_{i \in X}$ (i.e. of non-negative
$d \times d$ complex matrices with trace bounded by
1) such that

$\sum_{i \in X} \mathrm{Tr}[\rho_i] = 1$.

Preparation-tests are often called quantum information
sources in quantum information theory. A generic state
$\rho$ is an unnormalized density matrix. A deterministic
state, corresponding to a single-outcome preparation-test,
is a normalized density matrix $\rho$, with $\mathrm{Tr}[\rho] = 1$.

Diagonalizing $\rho = \sum_i \alpha_i |\psi_i\rangle\langle\psi_i|$ we then obtain that
each matrix $\alpha_i |\psi_i\rangle\langle\psi_i|$ is a refinement of $\rho$. More generally,
every matrix $\sigma$ such that $\sigma \le \rho$ is a refinement
of $\rho$. Up to a positive rescaling, all matrices with support
contained in the support of $\rho$ are refinements of $\rho$.
A quantum state $\rho$ is atomic (pure) if and only if it is
proportional to a rank-one projection. A quantum state
is completely mixed if and only if its density matrix has
full rank. Note that the quantum state $\chi = I_d/d$, where
$I_d$ is the identity $d \times d$ matrix, is a particular example of
a completely mixed state, but not the only example. Precisely,
$\chi = I_d/d$ is the unique unitarily invariant state in
dimension d.
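
For a qubit these criteria reduce to determinant tests, since a 2 x 2 Hermitian matrix is non-negative exactly when its trace and determinant are both non-negative. The following sketch checks purity, complete mixedness, and the condition $\sigma \le \rho$ on a few examples; the function names and the numerical tolerance are assumptions of the illustration.

```python
def det2(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def trace2(m):
    return (m[0][0] + m[1][1]).real

def is_nonnegative(m, tol=1e-12):
    """A 2x2 Hermitian matrix is >= 0 iff its trace and determinant are >= 0."""
    return trace2(m) >= -tol and det2(m) >= -tol

def is_pure(rho, tol=1e-12):
    """Nonzero and rank one: vanishing determinant, positive trace."""
    return abs(det2(rho)) <= tol and trace2(rho) > tol

def is_completely_mixed(rho, tol=1e-12):
    """Full rank: strictly positive determinant and trace."""
    return det2(rho) > tol and trace2(rho) > tol

def refines(sigma, rho):
    """True if 0 <= sigma <= rho, so that sigma is a refinement of rho."""
    diff = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    return is_nonnegative(sigma) and is_nonnegative(diff)

chi = [[0.5, 0.0], [0.0, 0.5]]            # I_d / d for d = 2
rho = [[0.8, 0.0], [0.0, 0.2]]            # full rank, but not unitarily invariant
zero = [[1.0, 0.0], [0.0, 0.0]]           # |0><0|
plus = [[0.5, 0.5], [0.5, 0.5]]           # |+><+|
scaled_plus = [[0.1 * x for x in row] for row in plus]

print(is_completely_mixed(chi), is_completely_mixed(rho), is_completely_mixed(zero))
print(is_pure(zero), is_pure(plus), is_pure(rho))
print(refines(scaled_plus, rho), refines(plus, rho), refines(scaled_plus, zero))
```

The last line illustrates the role of the rescaling: |+><+| itself is not below the full-rank state, a rescaled copy is, and no rescaling helps when the target state is the pure state |0><0|, whose support does not contain |+>.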

Let us now consider the case of observation-tests: in
quantum theory an observation-test is given by a POVM
(positive operator-valued measure), namely by a collection
$\{P_j\}_{j \in Y}$ of non-negative $d \times d$ matrices such that

$\sum_{j \in Y} P_j = I_d$.

An effect is then a non-negative matrix $P \ge 0$ upper
bounded by the identity. In quantum theory there is
only one deterministic effect, corresponding to a single-outcome
observation-test: the unique deterministic effect
is given by the identity matrix. As we will see in the following
section, the fact that the deterministic effect is
unique is equivalent to the fact that quantum theory is a
causal theory.

An effect P is atomic if and only if P is proportional 
to a rank-one projector. An observation-test is atomic if 
it is a POVM with rank-one elements. 

Finally, a general test from an input system with
Hilbert space $\mathcal{H}_1 = \mathbb{C}^{d_1}$ to an output system with Hilbert
space $\mathcal{H}_2 = \mathbb{C}^{d_2}$ is given by a quantum instrument,
namely by a collection $\{\mathcal{C}_k\}_{k \in Z}$ of completely positive
trace non-increasing maps sending linear operators on
$\mathcal{H}_1$ to linear operators on $\mathcal{H}_2$, with the property that

$\mathcal{C}_Z := \sum_{k \in Z} \mathcal{C}_k$

is trace-preserving. A general transformation is then 
given by a trace non-increasing map, called quantum op- 
eration, whereas a deterministic transformation, corre- 
sponding to a single-outcome test, is given by a trace- 
preserving map, called quantum channel. 

Any quantum operation $\mathcal{C}$ can be written in the Kraus
form $\mathcal{C}(\rho) = \sum_i C_i \rho C_i^\dagger$, where $C_i : \mathcal{H}_1 \to \mathcal{H}_2$ are the
Kraus operators. Up to a positive scaling, every quantum
operation $\mathcal{D}$ such that the Kraus operators of $\mathcal{D}$
belong to the linear span of the Kraus operators of $\mathcal{C}$
is a refinement of $\mathcal{C}$. A map $\mathcal{C}$ is atomic if and only if
there is only one Kraus operator in its Kraus form. A
reversible transformation in quantum theory is a unitary
map $\mathcal{U}(\rho) = U \rho U^\dagger$, where $U : \mathcal{H}_1 \to \mathcal{H}_2$ is a unitary
operator, that is $U^\dagger U = I_1$ and $U U^\dagger = I_2$, where $I_1$
($I_2$) is the identity operator on $\mathcal{H}_1$ ($\mathcal{H}_2$). Two quantum
systems are operationally equivalent if and only if the
corresponding Hilbert spaces have the same dimension.
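As an illustration of the Kraus form, the following sketch applies a quantum operation to a qubit density matrix in plain Python. The amplitude-damping Kraus operators, the value of `gamma`, the input state and all helper names are assumptions chosen for the example; they do not come from the text.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def apply_operation(kraus, rho):
    """Return sum_i C_i rho C_i^dagger for the given Kraus operators."""
    dim = len(kraus[0])  # output dimension
    out = [[0j] * dim for _ in range(dim)]
    for c in kraus:
        term = matmul(matmul(c, rho), dagger(c))
        out = [[out[i][j] + term[i][j] for j in range(dim)] for i in range(dim)]
    return out

# Hypothetical example: amplitude damping with damping probability gamma.
gamma = 0.3
k0 = [[1, 0], [0, math.sqrt(1 - gamma)]]
k1 = [[0, math.sqrt(gamma)], [0, 0]]
rho = [[0.25, 0.1], [0.1, 0.75]]

channel_out = apply_operation([k0, k1], rho)  # two Kraus operators: a channel
branch0 = apply_operation([k0], rho)          # one Kraus operator: atomic operation
branch1 = apply_operation([k1], rho)

print(trace(channel_out).real)                 # trace is preserved by the channel
print(trace(branch0).real, trace(branch1).real)  # each branch only lowers the trace
```

The two single-Kraus maps form a quantum instrument: each is trace non-increasing, and their sum is the trace-preserving channel.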

D. Operational principles 

We are now in position to make precise the usage of
the expression "operational principle" in the context of
this paper. By "operational principle" we mean here a
principle that can be stated using only the operational-probabilistic
language, i.e. using only

• the notions of system, test, outcome, probability, 
state, effect, transformation 

• their specifications: atomic, pure, mixed, com- 
pletely mixed 

• more complex notions constructed from the above 
terms (e.g. the notion of "reversible transforma- 
tion"). 

The distinction between operational principles and
principles referring to abstract mathematical properties,
mentioned in the introduction, should now be clear: for
example, a statement like "the pure states of a system
cannot be cloned" is a valid operational principle, because
it can be analyzed in basic operational-probabilistic
terms as "for every system A there exists no transformation
$\mathcal{C}$ with input system A and output system AA such
that $\mathcal{C}|\varphi) = |\varphi)|\varphi)$ for every pure state $\varphi$ of A". On
the contrary, a statement like "the state space of a system
with two perfectly distinguishable states is a three-dimensional
sphere" is not a valid operational principle,
because there is no way to express what it means for a
state space to be a three-dimensional sphere in terms of
basic operational notions. The fact that a state space
is a sphere may eventually be derived from operational
principles, but cannot be assumed as a starting point.



III. THE PRINCIPLES 

We now state the principles used in our derivation. 
The first five principles express generic features that are 
shared by both classical and quantum theory. They could 
be even included in the definition of the background 
framework: they define the simple model of information 
processing in which we try to single out quantum theory. 
For this reason we will call them axioms. The sixth prin- 
ciple in our derivation has a different status: it expresses 
the genuinely quantum features. A major message of our 
work is that, within a broad class of theories of informa- 
tion processing, quantum theory is completely described 
by the purification principle. To emphasize the special 
role of the sixth principle we will call it postulate, in anal- 
ogy with the parallel postulate of Euclidean geometry. 



A. Axioms 

1. Causality 

The first axiom of our list, causality, is so basic
that it could be considered as part of the background
framework. We decided to explicitly present it as an axiom
for two reasons. The first reason is that the framework
of operational-probabilistic theories can be developed
even without this requirement (see Ref. [21] for the
general framework and Refs. |3^] for two explicit examples
of non-causal theories). The second reason is that
we want to stress that causality is an essential ingredient
in our derivation. This observation is important in
view of possible extensions of quantum theory to quantum
gravity scenarios where the causal structure is not
defined from the start (see e.g. Hardy in Ref. [33]).

Axiom 1 (Causality) The probability of preparations 
is independent of the choice of observations. 

In technical terms: if $\{\rho_i\}_{i \in X} \subset \mathsf{St}(A)$ is a preparation-test,
then the conditional probability of the preparation
$\rho_i$ given the choice of the observation-test $\{a_j\}_{j \in Y}$ is the
marginal

$$p(i|\{a_j\}) = \sum_{j \in Y} (a_j|\rho_i).$$

The axiom states that the marginal probability $p(i|\{a_j\})$
is independent of the choice of the observation-test $\{a_j\}$:
if $\{a_j\}_{j \in Y}$ and $\{b_k\}_{k \in Z}$ are two different observation-tests,
then one has $p(i|\{a_j\}) = p(i|\{b_k\})$. Loosely speaking,
one may refer to causality as a requirement of no-signalling
from the future: indeed, causality is equivalent
to the fact that the probability of an outcome at a certain
time does not depend on the choice of operations
that will be done at later times [|o| . 
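In the quantum case, the causality condition can be checked numerically. The sketch below computes the marginal $p(i|\{a_j\})$ for one preparation-test and two different observation-tests on a qubit; the specific states, the two projective measurements and the function names are assumptions made for the illustration.

```python
def trace_product(a, b):
    """Tr[a b] for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def marginal(rho_i, observation_test):
    """p(i | {a_j}) = sum_j (a_j | rho_i), with (a|rho) = Tr[a rho]."""
    return sum(trace_product(effect, rho_i) for effect in observation_test)

# Two different observation-tests (hypothetical choice): Z basis and X basis.
z_test = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
x_test = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]

# A preparation-test made of two unnormalized states.
preparation_test = [
    [[0.3, 0.0], [0.0, 0.0]],      # 0.3 |0><0|
    [[0.35, 0.35], [0.35, 0.35]],  # 0.7 |+><+|
]

marginals_z = [marginal(rho, z_test) for rho in preparation_test]
marginals_x = [marginal(rho, x_test) for rho in preparation_test]
print(marginals_z)
print(marginals_x)
```

Both lists agree because every POVM sums to the identity, so the marginal reduces to the trace of the unnormalized state, whatever measurement is chosen later.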

An operational-probabilistic theory that satisfies the
causality Axiom 1 will be called causal. As we already
mentioned, causality is a very basic requirement and
could be considered as part of the framework: it provides
the notions used to state the other axioms and it implies
several facts that will be used frequently in the paper. In
fact, in our derivation we do not use the causality axiom
directly, but only through its consequences. In the following
we briefly summarize the facts and the notations
that characterize the framework of causal operational-probabilistic
theories, introduced and discussed in detail
in Ref. [21]. Similar structures have been subsequently
considered in Refs. [34, 35] within a formal description
of circuits in foliable spacetime regions.

First, causality is equivalent to the existence of an effect
$e_A$ such that $e_A = \sum_{j \in Y} a_j$ for every observation-test
$\{a_j\}_{j \in Y}$. We call the effect $e_A$ the deterministic effect
for system A. By definition, the effect $e_A$ is unique. The
subindex A in $e_A$ will be dropped when no confusion can
arise.

In a causal theory every test $\{\mathcal{C}_i\}_{i \in X} \subset \mathsf{Transf}(A,B)$
satisfies the condition

$$\sum_{i \in X} (e_B|\,\mathcal{C}_i = (e_A|.$$

As a consequence, a transformation $\mathcal{C} \in \mathsf{Transf}(A,B)$
satisfies the condition

$$(e_B|\,\mathcal{C} \le (e_A| \qquad (4)$$

with the equality if and only if $\mathcal{C}$ is a channel (i.e. a
deterministic transformation, corresponding to a single-outcome
test). In Eq. (4) we used the notation $(a| \le (a'|$
to mean $(a|\rho) \le (a'|\rho)$ for every $\rho \in \mathsf{St}(A)$.

In a causal theory the norm of a state $\rho_i \in \mathsf{St}(A)$ is
given by $\|\rho_i\| = (e|\rho_i)$. Accordingly, one can define the
normalized state

$$\bar{\rho}_i := \frac{\rho_i}{\|\rho_i\|}.$$

In a causal theory one can always allow for rescaled
preparations: conditionally on the outcome $i \in X$ in the
preparation-test $\{\rho_i\}_{i \in X}$ we can say that we prepared
the normalized state $\bar{\rho}_i$. For this reason, every state in a
causal theory is proportional to a normalized state.

The set of normalized states will be denoted by $\mathsf{St}_1(A)$.
Since the set of all states $\mathsf{St}(A)$ is closed in the operational
norm, also the set of normalized states $\mathsf{St}_1(A)$ is closed.
Moreover, the set $\mathsf{St}_1(A)$ is convex [21]: this means that
for every pair of normalized states $\rho_1, \rho_2 \in \mathsf{St}_1(A)$ and
for every probability $p \in [0,1]$ the convex combination
$\rho_p = p\rho_1 + (1-p)\rho_2$ is a normalized state. Operationally,
the state $\rho_p$ is obtained by

1. performing a binary test with outcomes $\{1, 2\}$ and
outcome probabilities $p_1 = p$ and $p_2 = 1 - p$

2. for outcome $i$ preparing $\rho_i$, thus realizing the
preparation-test $\{p_i \rho_i\}_{i=1,2}$

3. coarse-graining over the outcomes, thus obtaining
$\rho_p = p\rho_1 + (1-p)\rho_2$.

Step 2 (preparation of a state conditionally on the
outcome of a previous test) is possible because the theory
is causal [21].
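The three operational steps above can be sketched for qubit density matrices as follows. The two states, the probability `p` and the helper names are assumptions chosen for the illustration.

```python
def scale(p, rho):
    """Multiply a matrix (nested lists) by the probability p."""
    return [[p * x for x in row] for row in rho]

def coarse_grain(preparation_test):
    """Sum the unnormalized states of a preparation-test entry by entry."""
    dim = len(preparation_test[0])
    return [[sum(state[i][j] for state in preparation_test)
             for j in range(dim)] for i in range(dim)]

# Step 1: binary test with probabilities p and 1 - p (hypothetical value).
p = 0.25
# Step 2: conditional preparation of rho_1 or rho_2.
rho_1 = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
rho_2 = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|
preparation_test = [scale(p, rho_1), scale(1 - p, rho_2)]
# Step 3: coarse-graining over the two outcomes.
rho_p = coarse_grain(preparation_test)

print(rho_p)
```

The resulting matrix again has unit trace, as expected for a normalized state.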

The pure normalized states are the extreme points of
the convex set $\mathsf{St}_1(A)$. For a normalized state $\rho \in \mathsf{St}_1(A)$
we define the face identified by $\rho$ as follows:

Definition 4 (Face identified by a state) The face
identified by $\rho \in \mathsf{St}_1(A)$ is the set $F_\rho$ of all normalized
states $\sigma \in \mathsf{St}_1(A)$ such that $\rho = p\sigma + (1-p)\tau$, for some
non-zero probability $p > 0$ and some normalized state
$\tau \in \mathsf{St}_1(A)$.

In other words, $F_\rho$ is the set of all normalized states that
show up in the convex decompositions of $\rho$. Clearly, if
$\varphi$ is a pure state, then one has $F_\varphi = \{\varphi\}$. The opposite
situation is that of completely mixed states: by definition
[I], a state $\omega \in \mathsf{St}_1(A)$ is completely mixed if every state
$\sigma \in \mathsf{St}_1(A)$ can stay in its convex decomposition, that is,
if $F_\omega = \mathsf{St}_1(A)$. An equivalent condition for a state to be
completely mixed is the following:

Lemma 1 A state $\omega \in \mathsf{St}_1(A)$ is completely mixed if and
only if $\mathrm{Span}(F_\omega) = \mathsf{St}_{\mathbb{R}}(A)$.

Proof. The condition is clearly necessary. It is also
sufficient because for a state $\sigma \in \mathsf{St}_1(A)$ the relation
$\sigma \in \mathrm{Span}(F_\omega)$ implies $\sigma \in F_\omega$ (see Lemma 16 of Ref. [21]). ■

A completely mixed state can never be distinguished 
from another state with zero error probability: 



Proposition 1 Let $\rho \in \mathsf{St}_1(A)$ be a completely mixed
state and $\sigma \in \mathsf{St}_1(A)$ be an arbitrary state. Then, the
probability of error in distinguishing $\rho$ from $\sigma$ is strictly
greater than zero.

Proof. By contradiction, suppose that one can distinguish
between $\rho$ and $\sigma$ with zero error probability.
This means that there exists a binary test $\{a_\rho, a_\sigma\}$ such
that $(a_\rho|\sigma) = (a_\sigma|\rho) = 0$. Since $\rho$ is completely mixed
there exists a probability $p > 0$ and a state $\tau \in \mathsf{St}_1(A)$
such that $\rho = p\sigma + (1-p)\tau$. Hence, the condition
$(a_\sigma|\rho) = 0$ implies $(a_\sigma|\sigma) = 0$. Therefore, we have
$(a_\rho|\sigma) + (a_\sigma|\sigma) = 0$. This is in contradiction with the
normalization of the probabilities in the test $\{a_\rho, a_\sigma\}$,
which would require $(a_\rho|\sigma) + (a_\sigma|\sigma) = 1$. ■



2. Perfect distinguishability 

Our second axiom regards the task of state discrimination.
As we saw in Proposition 1, if a state is completely
mixed, then it is impossible to distinguish it perfectly
from any other state. Axiom 2 states the converse:

Axiom 2 (Perfect distinguishability) Every state 
that is not completely mixed can be perfectly distinguished 
from some other state. 

Note that the statement of Axiom 2 holds for quantum
and for classical information theory. In quantum theory a
completely mixed state is a density matrix with full rank.
If a density matrix $\rho$ does not have full rank, then it must have
a kernel: hence, every density matrix $\sigma$ with support in
the kernel of $\rho$ will be perfectly distinguishable from $\rho$,
as stated in Axiom 2. Applying the same reasoning to
density matrices that are diagonal in a given basis, one
can easily see that Axiom 2 is satisfied also by classical
information theory.
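The kernel argument can be made concrete for matrices that are diagonal in a fixed basis, which covers the classical case and a special case of the quantum one. In this sketch a diagonal state or effect is stored as the list of its diagonal entries; the function names and the example states are assumptions introduced for the illustration.

```python
def pairing(effect_diag, state_diag):
    """(a|rho) = Tr[a rho] for diagonal effect and state."""
    return sum(a * s for a, s in zip(effect_diag, state_diag))

def kernel_state(rho_diag, tol=1e-12):
    """Uniform state supported on the kernel of rho, or None if rho has full rank."""
    kernel = [1.0 if x <= tol else 0.0 for x in rho_diag]
    size = sum(kernel)
    if size == 0:
        return None
    return [k / size for k in kernel]

def discriminating_test(rho_diag, tol=1e-12):
    """Binary test {a_rho, a_sigma}: projectors on the support and on the kernel of rho."""
    a_sigma = [1.0 if x <= tol else 0.0 for x in rho_diag]
    a_rho = [1.0 - a for a in a_sigma]
    return a_rho, a_sigma

rho = [0.5, 0.5, 0.0]            # not completely mixed: rank 2 in dimension 3
sigma = kernel_state(rho)        # state supported on the kernel of rho
a_rho, a_sigma = discriminating_test(rho)
errors = (pairing(a_rho, sigma), pairing(a_sigma, rho))

omega = [0.5, 0.3, 0.2]          # completely mixed: full rank
print(sigma, errors, kernel_state(omega))
```

For the full-rank state `omega` the kernel is empty, so this construction yields no perfectly distinguishable partner, in agreement with Proposition 1.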

To the best of our knowledge, the perfect distinguishability
property has never been considered in the literature
as an axiom, probably because in most works it
came for free as a consequence of stronger mathematical
assumptions. For example, one can obtain the perfect
distinguishability property from the no-restriction
hypothesis of Ref. [21], stating that for every system
A any binary probability rule (i.e. any pair of positive
functionals $a_0, a_1 \in \mathsf{Eff}_{\mathbb{R}}(A)$ such that $a_0 + a_1 = e_A$)
actually describes a measurement allowed by the theory.
This assumption was made e.g. in Ref. [18] in the case
of systems with at most two distinguishable states (see
requirement 5 of Ref. [18]). Note that the difference
between the perfect distinguishability axiom and the no-restriction
hypothesis is that the former can be expressed
in purely operational terms, whereas the latter requires
the notion of "positive functional", which is not part of
the basic operational language.



3. Ideal compression

The third axiom is about information compression. An
information source for system A is a preparation-test
$\{\rho_i\}_{i \in X}$, where each $\rho_i \in \mathsf{St}(A)$ is an unnormalized state
and $\sum_{i \in X} (e|\rho_i) = 1$. A compression scheme is given
by an encoding operation $\mathcal{E}$ from A to a smaller system
C, that is, to a system C such that $D_C \le D_A$. The
compression scheme is lossless for the source $\{\rho_i\}_{i \in X}$ if
there exists a decoding operation $\mathcal{D}$ from C to A such
that $\mathcal{D}\mathcal{E}|\rho_i) = |\rho_i)$ for every value of the index $i \in X$.
This means that the decoding allows one to perfectly retrieve
the states $\{\rho_i\}_{i \in X}$. We say that a compression
scheme is lossless for the state $\rho$ if it is lossless for every
source $\{\rho_i\}_{i \in X}$ such that $\rho = \sum_{i \in X} \rho_i$. Equivalently, this
means that the restriction of $\mathcal{D}\mathcal{E}$ to the face identified by
$\rho$ is equal to the identity channel: $\mathcal{D}\mathcal{E}|\sigma) = |\sigma)$ for every
$\sigma \in F_\rho$.

A lossless compression scheme is maximally efficient
if the encoding system C has the smallest possible size,
that is, if the system C has no more states than exactly
those needed to compress $\rho$. This happens when every
normalized state $\tau \in \mathsf{St}_1(C)$ comes from the encoding of
some normalized state $\sigma \in F_\rho$, namely $|\tau) = \mathcal{E}|\sigma)$.

We say that a compression scheme that is lossless and
maximally efficient is ideal. Our third axiom states that
ideal compression is always possible:

Axiom 3 (Ideal compression) For every state there 
exists an ideal compression scheme. 

It is easy to see that this statement holds in quantum
theory and in classical probability theory. For example,
if $\rho$ is a density matrix on a $d$-dimensional Hilbert space
and $\mathrm{rank}(\rho) = r$, then the ideal compression is obtained
by just encoding $\rho$ in an $r$-dimensional Hilbert space. As
long as we do not tolerate losses, this is the most efficient
one-shot compression we can devise in quantum theory.
Similar observations hold for classical information theory.
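A minimal sketch of the quantum example, assuming a rank-2 real density matrix on a 3-dimensional space whose support is known in advance. The isometry `V`, the maps `encode` and `decode`, and the test states are hypothetical names and values used only for illustration; the encoding is taken to be $\sigma \mapsto V^T \sigma V$ and the decoding $\tau \mapsto V \tau V^T$.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [[a[i][j] for i in range(len(a))] for j in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
# Columns of V span the support of rho: v0 = |0>, v1 = (|1> + |2>)/sqrt(2).
V = [[1.0, 0.0], [0.0, s], [0.0, s]]

def encode(sigma):
    """Map a 3x3 matrix to the 2-dimensional system C."""
    return matmul(matmul(transpose(V), sigma), V)

def decode(tau):
    """Map a 2x2 matrix on C back to the 3-dimensional system A."""
    return matmul(matmul(V, tau), transpose(V))

rho = [[0.6, 0.0, 0.0], [0.0, 0.2, 0.2], [0.0, 0.2, 0.2]]    # rank 2
sigma = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]  # pure state in the face of rho
outside = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # not in the face
tau = [[0.5, 0.1], [0.1, 0.5]]                                # arbitrary state of C

print(close(decode(encode(rho)), rho))      # lossless on rho
print(close(decode(encode(sigma)), sigma))  # lossless on the face of rho
print(close(encode(decode(tau)), tau))      # every state of C is an encoded state
print(encode(outside))                      # trace drops below 1 outside the face
```

In this sketch the encoding is trace-preserving only on states supported in the support of `rho`, which is all that losslessness for `rho` requires.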



4. Local distinguishability

The fourth axiom consists in the assumption of local
distinguishability, here presented in the formulation of
Ref. [21].

Axiom 4 (Local distinguishability) If two bipartite
states are different, then they give different probabilities
for at least one product experiment.

In more technical terms: if $\rho, \sigma \in \mathsf{St}_1(AB)$ are states
and $\rho \neq \sigma$, then there are two effects $a \in \mathsf{Eff}(A)$ and
$b \in \mathsf{Eff}(B)$ such that

$$(a \otimes b|\rho)_{AB} \neq (a \otimes b|\sigma)_{AB}.$$

Local distinguishability is equivalent to the fact that
two distant parties, holding systems A and B, respectively,
can distinguish between the two states $\rho, \sigma \in \mathsf{St}_1(AB)$
using only local operations and classical communication
and achieving an error probability strictly
smaller than $p_{\rm ran} = 1/2$, the probability of error in a random
guess [21]. Again, this statement holds in ordinary
quantum theory (on complex Hilbert spaces) and in classical
information theory.

Another condition equivalent to local distinguishability
is the local tomography axiom, introduced in Refs.
]l9| , ^6). The local tomography axiom states that every
bipartite state can be reconstructed from the statistics
of local measurements on the component systems. Technically,
local tomography is in turn equivalent to the relation
$D_{AB} = D_A D_B$ [16] and to the fact that every state
$\rho \in \mathsf{St}(AB)$ can be written as

$$\rho = \sum_{i=1}^{D_A} \sum_{j=1}^{D_B} \rho_{ij}\, \alpha_i \otimes \beta_j,$$

where $\{\alpha_i\}_{i=1}^{D_A}$ ($\{\beta_j\}_{j=1}^{D_B}$) is a basis for the vector space
$\mathsf{St}_{\mathbb{R}}(A)$ ($\mathsf{St}_{\mathbb{R}}(B)$). The analogous condition also holds for
effects: every bipartite effect $E \in \mathsf{Eff}(AB)$ can be written
as

$$E = \sum_{i=1}^{D_A} \sum_{j=1}^{D_B} E_{ij}\, a_i \otimes b_j,$$

where $\{a_i\}_{i=1}^{D_A}$ ($\{b_j\}_{j=1}^{D_B}$) is a basis for the vector space
$\mathsf{Eff}_{\mathbb{R}}(A)$ ($\mathsf{Eff}_{\mathbb{R}}(B)$).

An important consequence of local distinguishability,
observed in Ref. [21], is that a transformation
$\mathcal{C} \in \mathsf{Transf}(A,B)$ is completely specified by its action on $\mathsf{St}(A)$:
thanks to local distinguishability we have the implication

$$\mathcal{C}|\rho) = \mathcal{C}'|\rho) \quad \forall \rho \in \mathsf{St}(A) \quad \Longrightarrow \quad \mathcal{C} = \mathcal{C}' \qquad (5)$$

(see Lemma 14 of Ref. [21] for the proof). Note that Eq.
(5) does not hold for quantum theory on real Hilbert
spaces Q.


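For two qubits, the local tomography property discussed in this subsection can be illustrated by reconstructing a bipartite density matrix from the expectation values of product observables. The choice of the Pauli matrices as local bases, the Bell state used as input, and all function names are assumptions made for this sketch.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Tensor product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]   # a basis for the 2x2 Hermitian matrices (4 elements)

def local_statistics(rho):
    """Expectation values Tr[(s_i (x) s_j) rho] of the 4 x 4 product observables."""
    return [[trace(matmul(kron(a, b), rho)).real for b in PAULIS] for a in PAULIS]

def reconstruct(stats):
    """rho = (1/4) sum_ij stats[i][j] s_i (x) s_j."""
    out = [[0j] * 4 for _ in range(4)]
    for i, a in enumerate(PAULIS):
        for j, b in enumerate(PAULIS):
            term = kron(a, b)
            out = [[out[r][c] + stats[i][j] * term[r][c] / 4 for c in range(4)]
                   for r in range(4)]
    return out

# Hypothetical input: the Bell state (|00> + |11>)/sqrt(2).
bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
stats = local_statistics(bell)
rebuilt = reconstruct(stats)
print(stats[2][2])  # correlation of Y (x) Y
```

The 16 product statistics match $D_{AB} = D_A D_B = 4 \cdot 4$. The nonzero $Y \otimes Y$ term hints at why the count fails on real Hilbert spaces: real symmetric $2 \times 2$ matrices span only 3 dimensions, giving 9 product terms, while real symmetric $4 \times 4$ matrices span 10.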

5. Pure conditioning 

The fifth axiom states how the outcomes of a measurement
on one side of a pure bipartite state can induce
pure states on the other side. In this case we consider
atomic measurements, that is, measurements described
by observation-tests $\{a_i\}_{i \in X}$ where each effect
$a_i$ is atomic. Intuitively, atomic measurements are those
with maximum "resolving power".

Axiom 5 (Pure conditioning) If a bipartite system is 
in a pure state, then each outcome of an atomic measure- 
ment on one side induces a pure state on the other. 

The pure conditioning property holds in quantum theory
and in classical information theory as well. In fact,
the statement is trivial in classical information theory,
because the only pure bipartite states are products of
pure states: no matter which measurement is performed
on one side, the remaining state on the other side will
necessarily be pure.

The pure conditioning property, as formulated above,
has been recently introduced in Ref. [37]. A stronger
version of axiom 5 is the atomicity of composition intro- 
duced in Ref. ||: 

5' Atomicity of composition: the sequential composi- 
tion of two atomic operations is atomic. 

Since pure states and atomic effects are a particular case 
of atomic transformations, Axiom 5' implies Axiom 5. 
In our derivation, however, also the converse implication 
holds: indeed, thanks to the purification postulate we 
will be able to show that Axiom 5 implies Axiom 5' (see 
lemma flG) . 



B. The purification postulate 

The last postulate in our list is the purification postulate,
which was introduced and explored in detail in Ref. [21].
While the previous axioms were also satisfied by
classical probability theory, the purification postulate introduces
in our derivation the genuinely quantum features.
A purification of the state $\rho \in \mathsf{St}_1(A)$ is a pure state $\Psi_\rho$
of some composite system AB, with the property that $\rho$
is the marginal of $\Psi_\rho$, that is,

$$(e_B|\Psi_\rho)_{AB} = |\rho)_A.$$

Here we refer to the system B as the purifying system.
The purification postulate states that every state can be
obtained as the marginal of a pure bipartite state in an
essentially unique way:

Postulate 1 (Purification) Every state has a purification.
For fixed purifying system, every two purifications
of the same state are connected by a reversible transformation
on the purifying system.

Informally speaking, our postulate states that the ignorance
about a part is always compatible with a maximal
knowledge of the whole. The existence of pure bipartite
states with mixed marginal was already recognized by
Schrödinger as the characteristic trait of quantum theory
p3| . Here, however, we also emphasize the importance
of the uniqueness of purification up to reversible transformations:
this property sets up a relation between pure
states and reversible transformations that generates most
of the structure of quantum theory. As shown in Ref.
[21], an impressive number of quantum features are actually
direct consequences of purification. In particular,
purification implies the possibility of simulating any irreversible
process through a reversible interaction of the
system with an environment that is finally discarded.
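In quantum theory both parts of the purification postulate can be checked on a small example. The sketch below represents a pure bipartite state $|\Psi\rangle = \sum_{ab} m_{ab} |a\rangle|b\rangle$ by its coefficient matrix $M = (m_{ab})$, so that the marginal on A is $M M^\dagger$ and a unitary $U$ on the purifying system acts as $M \mapsto M U^T$. This representation, the qubit state and the Hadamard unitary are assumptions made for the illustration.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [[a[i][j] for i in range(len(a))] for j in range(len(a[0]))]

def dagger(a):
    return [[complex(x).conjugate() for x in row] for row in transpose(a)]

def marginal_on_a(coeffs):
    """Marginal on system A of the pure state with coefficient matrix coeffs."""
    return matmul(coeffs, dagger(coeffs))

def act_on_purifying_system(coeffs, u):
    """Coefficient matrix of (I (x) U)|Psi>."""
    return matmul(coeffs, transpose(u))

# Hypothetical mixed state rho = diag(0.8, 0.2) and one of its purifications.
psi = [[math.sqrt(0.8), 0.0], [0.0, math.sqrt(0.2)]]

# A second purification, obtained with a reversible map on the purifying system.
s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]
psi_prime = act_on_purifying_system(psi, hadamard)

rho = marginal_on_a(psi)
rho_prime = marginal_on_a(psi_prime)
purity = sum(abs(rho[i][j]) ** 2 for i in range(2) for j in range(2))
print(rho, rho_prime, purity)
```

Both purifications have the same mixed marginal (purity below 1), while the global state stays pure.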



IV. FIRST CONSEQUENCES OF THE 
PRINCIPLES 



A. Results about ideal compression 

Let $\rho \in \mathsf{St}_1(A)$ be a state and let $\mathcal{E} \in \mathsf{Transf}(A,C)$
(resp. $\mathcal{D} \in \mathsf{Transf}(C,A)$) be its encoding (resp. decoding)
in the ideal compression scheme of Axiom 3.

Essentially, the encoding operation $\mathcal{E} \in \mathsf{Transf}(A,C)$
identifies the face $F_\rho$ with the state space $\mathsf{St}_1(C)$. In the
following we provide a list of elementary lemmas showing
that all statements about $F_\rho$ can be translated into
statements about $\mathsf{St}_1(C)$ and vice versa.

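Lemma 2 The composition of decoding and encoding is
the identity on C, namely $\mathcal{E}\mathcal{D} = \mathcal{I}_C$.

Proof. Since the compression is maximally efficient, for
every state $\tau \in \mathsf{St}_1(C)$ there is a state $\sigma \in F_\rho$ such that
$\mathcal{E}\sigma = \tau$. Using the fact that $\mathcal{D}\mathcal{E}\sigma = \sigma$ (the compression
is lossless) we then obtain $\mathcal{E}\mathcal{D}\tau = \mathcal{E}\mathcal{D}\mathcal{E}\sigma = \mathcal{E}\sigma = \tau$. By
local distinguishability [see Eq. (5)], this implies $\mathcal{E}\mathcal{D} = \mathcal{I}_C$. ■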

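Lemma 3 The image of $\mathsf{St}_1(C)$ under the decoding operation
$\mathcal{D}$ is $F_\rho$.

Proof. Since the compression is maximally efficient, for
all $\tau \in \mathsf{St}_1(C)$ there exists $\sigma \in F_\rho$ such that $\tau = \mathcal{E}\sigma$.
Then, $\mathcal{D}\tau = \mathcal{D}\mathcal{E}\sigma = \sigma$. This implies that $\mathcal{D}(\mathsf{St}_1(C)) \subseteq F_\rho$.
On the other hand, since the compression is lossless,
for every state $\sigma \in F_\rho$ one has $\mathcal{D}\mathcal{E}\sigma = \sigma$. This implies
the inclusion $F_\rho \subseteq \mathcal{D}(\mathsf{St}_1(C))$. ■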

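Lemma 4 If the state $\varphi \in F_\rho$ is pure, then the state
$\mathcal{E}\varphi \in \mathsf{St}_1(C)$ is pure. If the state $\psi \in \mathsf{St}_1(C)$ is pure,
then the state $\mathcal{D}\psi \in F_\rho$ is pure.

Proof. Suppose that $\varphi \in F_\rho$ is pure and that $\mathcal{E}\varphi$ can
be written as $\mathcal{E}\varphi = p\sigma + (1-p)\tau$ for some $p > 0$ and
some $\sigma, \tau \in \mathsf{St}_1(C)$. Applying $\mathcal{D}$ on both sides we obtain
$\varphi = p\mathcal{D}\sigma + (1-p)\mathcal{D}\tau$. Since $\varphi$ is pure we must have
$\mathcal{D}\sigma = \mathcal{D}\tau = \varphi$. Now, applying $\mathcal{E}$ on all terms of the
equality and using Lemma 2 we obtain $\sigma = \tau = \mathcal{E}\varphi$.
This proves that $\mathcal{E}\varphi$ is pure. Conversely, suppose that
$\psi \in \mathsf{St}_1(C)$ is pure and $\mathcal{D}\psi = p\sigma + (1-p)\tau$ for some
$p > 0$ and some $\sigma, \tau \in \mathsf{St}_1(A)$. Since $\mathcal{D}\psi$ is in the face $F_\rho$
(Lemma 3), also $\sigma$ and $\tau$ are in the same face. Applying
$\mathcal{E}$ on both sides of the equality $\mathcal{D}\psi = p\sigma + (1-p)\tau$ and
using Lemma 2 we obtain $\psi = \mathcal{E}\mathcal{D}\psi = p\mathcal{E}\sigma + (1-p)\mathcal{E}\tau$.
Since $\psi$ is pure we must have $\mathcal{E}\sigma = \mathcal{E}\tau = \psi$. Applying
$\mathcal{D}$ on all terms of the equality we then have $\sigma = \tau = \mathcal{D}\psi$,
thus proving that $\mathcal{D}\psi$ is pure. ■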

We say that a state $\sigma \in F_\rho$ is completely mixed relative
to the face $F_\rho$ if every state $\tau \in F_\rho$ can stay in the convex
decomposition of $\sigma$. In other words, $\sigma$ is completely
mixed relative to $F_\rho$ if one has $F_\sigma = F_\rho$. Note that in
general $\sigma \in F_\rho$ implies $F_\sigma \subseteq F_\rho$.
We then have the following:

Lemma 5 If the state $\omega \in F_\rho$ is completely mixed relative
to $F_\rho$, then the state $\mathcal{E}\omega \in \mathsf{St}_1(C)$ is completely
mixed. If the state $\upsilon \in \mathsf{St}_1(C)$ is completely mixed, then
the state $\mathcal{D}\upsilon \in F_\rho$ is completely mixed relative to $F_\rho$.

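Proof. Suppose that $\omega$ is completely mixed relative to
$F_\rho$. Then every state $\sigma \in F_\rho$ can stay in its convex
decomposition, say $\omega = p\sigma + (1-p)\sigma'$ with $p > 0$ and
$\sigma' \in F_\rho$. Applying $\mathcal{E}$ we have

$$\mathcal{E}\omega = p\,\mathcal{E}\sigma + (1-p)\,\mathcal{E}\sigma'. \qquad (6)$$

Since the compression is maximally efficient, for every
state $\tau \in \mathsf{St}_1(C)$ there exists a state $\sigma \in F_\rho$ such that
$\tau = \mathcal{E}\sigma$. Choosing the suitable $\sigma \in F_\rho$ and substituting
$\tau$ for $\mathcal{E}\sigma$ in Eq. (6) we then obtain that for every state
$\tau \in \mathsf{St}_1(C)$ there exists a probability $p > 0$ and a state
$\sigma' \in F_\rho$ such that

$$\mathcal{E}\omega = p\,\tau + (1-p)\,\mathcal{E}\sigma'.$$

This implies that $\mathcal{E}\omega$ is completely mixed. Suppose now
that $\upsilon \in \mathsf{St}_1(C)$ is completely mixed. Then every state
$\tau \in \mathsf{St}_1(C)$ can stay in its convex decomposition, say
$\upsilon = p\tau + (1-p)\tau'$, with $p > 0$ and $\tau' \in \mathsf{St}_1(C)$. Applying
$\mathcal{D}$ on both sides we have

$$\mathcal{D}\upsilon = p\,\mathcal{D}\tau + (1-p)\,\mathcal{D}\tau'. \qquad (7)$$

Now, using Lemma 3 we have that every state $\sigma \in F_\rho$ can
be written as $\sigma = \mathcal{D}\tau$ for some $\tau \in \mathsf{St}_1(C)$. Choosing the
suitable $\tau \in \mathsf{St}_1(C)$ and substituting $\sigma$ for $\mathcal{D}\tau$ in Eq. (7)
we then obtain that for every state $\sigma \in F_\rho$ there exists
a probability $p > 0$ and a state $\tau' \in \mathsf{St}_1(C)$ such that
$\mathcal{D}\upsilon = p\sigma + (1-p)\mathcal{D}\tau'$. Therefore, $\mathcal{D}\upsilon$ is completely
mixed relative to $F_\rho$. ■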

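We now show that the system C used for ideal compression
of the state $\rho$ is unique up to operational equivalence:

Lemma 6 If two systems C and C' allow for ideal compression
of a state $\rho \in \mathsf{St}_1(A)$, then C and C' are operationally
equivalent.

Proof. Let $\mathcal{E}, \mathcal{D}$ and $\mathcal{E}', \mathcal{D}'$ denote the encoding/decoding
schemes for systems C and C', respectively.
Define the transformations $\mathcal{U} := \mathcal{E}'\mathcal{D} \in \mathsf{Transf}(C, C')$
and $\mathcal{V} := \mathcal{E}\mathcal{D}' \in \mathsf{Transf}(C', C)$. It is easy to see that
$\mathcal{U}$ is reversible and $\mathcal{U}^{-1} = \mathcal{V}$. Indeed, since the restriction
of $\mathcal{D}'\mathcal{E}'$ and $\mathcal{D}\mathcal{E}$ to the face $F_\rho$ is the identity, using
Lemma 3 one has $\mathcal{D}'\mathcal{E}'\mathcal{D} = \mathcal{D}$ and similarly $\mathcal{D}\mathcal{E}\mathcal{D}' = \mathcal{D}'$.
Hence, we have $\mathcal{U}\mathcal{V} = \mathcal{E}'\mathcal{D}\mathcal{E}\mathcal{D}' = \mathcal{E}'\mathcal{D}' = \mathcal{I}_{C'}$ and
$\mathcal{V}\mathcal{U} = \mathcal{E}\mathcal{D}'\mathcal{E}'\mathcal{D} = \mathcal{E}\mathcal{D} = \mathcal{I}_C$. ■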

It is useful to introduce the notion of equality upon
input of $\rho$. We say that two transformations $\mathcal{A}, \mathcal{A}' \in \mathsf{Transf}(A,B)$
are equal upon input of $\rho \in \mathsf{St}(A)$ if their
restrictions to the face identified by $\rho$ are equal, that is,
if $\mathcal{A}\sigma = \mathcal{A}'\sigma$ for every $\sigma \in F_\rho$. If $\mathcal{A}$ and $\mathcal{A}'$ are equal
upon input of $\rho$ we write $\mathcal{A} =_\rho \mathcal{A}'$.

Using the notion of equality upon input of $\rho$ we can
rephrase the fact that the compression is lossless for $\rho$ as
$\mathcal{D}\mathcal{E} =_\rho \mathcal{I}_A$. Similarly, we can state the following:



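Lemma 7 The encoding $\mathcal{E}$ is deterministic upon input
of $\rho$, that is $(e_C|\,\mathcal{E} =_\rho (e_A|$.

Proof. For every $\sigma \in F_\rho$ we have $(e_C|\,\mathcal{E}\,|\sigma) \ge (e_A|\,\mathcal{D}\mathcal{E}\,|\sigma) = (e_A|\sigma) = 1$,
having used Eq. (4) and the
fact that the compression is lossless. Since probabilities
are bounded by 1, this implies $(e_C|\,\mathcal{E}\,|\sigma) = (e_A|\sigma)$ for
every $\sigma \in F_\rho$, that is, $(e_C|\,\mathcal{E} =_\rho (e_A|$. ■

A similar result holds for the decoding:

Lemma 8 The decoding $\mathcal{D}$ is deterministic, that is
$(e_A|\,\mathcal{D} = (e_C|$.

Proof. For every $\tau \in \mathsf{St}_1(C)$ we have $(e_A|\,\mathcal{D}\,|\tau) \ge (e_C|\,\mathcal{E}\mathcal{D}\,|\tau) = (e_C|\tau)$,
having used Eq. (4) and Lemma 2.
Since Eq. (4) also gives $(e_A|\,\mathcal{D} \le (e_C|$, we conclude
that $(e_A|\,\mathcal{D} = (e_C|$. ■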



B. Results about purification 

The purification Postulate 1 implies a large number of
quantum features, as was shown in Ref. [21]. Here we
review only the facts that are useful for our derivation,
referring to Ref. [21] for the proofs.

An elementary consequence of the uniqueness of purification
is that the group $\mathbf{G}_A$ of reversible transformations
on A acts transitively on the set of pure states:

Lemma 9 (Transitivity on pure states) For every
couple of pure states $\psi, \psi' \in \mathsf{St}_1(A)$ there is a reversible
transformation $\mathcal{U} \in \mathbf{G}_A$ such that $\psi' = \mathcal{U}\psi$.

Proof. See Lemma 20 of Ref. [21]. ■

Transitivity implies that for every system A there is
a unique state $\chi_A \in \mathsf{St}_1(A)$ that is invariant under reversible
transformations, that is, a unique state such that
$\mathcal{U}\chi_A = \chi_A$ for every $\mathcal{U} \in \mathbf{G}_A$:

Lemma 10 (Uniqueness of the invariant state)
For every system A, there is a unique state $\chi_A$ invariant
under all reversible transformations in $\mathbf{G}_A$. The
invariant state has the following properties:

1. $\chi_A$ is completely mixed

2. $\chi_{AB} = \chi_A \otimes \chi_B$.

Proof. See Corollary 34 and Theorem 4 of Ref. [21].
The proof of item 2 uses the local distinguishability axiom. ■

When there is no ambiguity we will drop the subindex
A and simply write $\chi$.

The uniqueness of purification in Postulate 1 requires
that if $\Psi_\rho, \Psi'_\rho \in \mathsf{St}_1(AB)$ are two purifications of
$\rho \in \mathsf{St}_1(A)$, then there exists a reversible transformation
$\mathcal{U} \in \mathbf{G}_B$ such that $\Psi'_\rho = (\mathcal{I}_A \otimes \mathcal{U})\Psi_\rho$. The following
lemma extends the uniqueness property to purifications
with different purifying systems:

Lemma 11 (Uniqueness of the purification up to
channels on the purifying systems) Let $\Psi \in \mathsf{St}_1(AB)$
and $\Psi' \in \mathsf{St}_1(AC)$ be two purifications of $\rho \in \mathsf{St}_1(A)$.
Then there exists a channel $\mathcal{C} \in \mathsf{Transf}(B, C)$ such that

$$|\Psi')_{AC} = (\mathcal{I}_A \otimes \mathcal{C})\,|\Psi)_{AB}.$$

Proof. See Lemma 21 of Ref. [21]. ■

Another consequence of the uniqueness of purification
is the fact that any ensemble decomposition of a given
mixed state can be obtained by performing a measurement
on the purifying system:

Lemma 12 (Purification of preparation-tests) Let
$\rho \in \mathsf{St}_1(A)$ be a state and $\Psi_\rho \in \mathsf{St}_1(AB)$ be a purification
of $\rho$. If $\{\rho_i\}_{i \in X}$ is a preparation-test such that $\sum_{i \in X} \rho_i = \rho$,
then there exists an observation-test $\{a_i\}_{i \in X}$ on the
purifying system such that

$$|\rho_i)_A = (a_i|_B\,|\Psi_\rho)_{AB} \qquad \forall i \in X.$$

Proof. See Lemma 8 of Ref. [21]. ■

An easy consequence is the following:

Corollary 1 If $\Psi_\rho \in \mathsf{St}_1(AB)$ is a purification of
$\rho \in \mathsf{St}_1(A)$ and $\sigma$ belongs to the face $F_\rho$, then there exists an
effect $b$ and a non-zero probability $p > 0$ such that

$$(b|_B\,|\Psi_\rho)_{AB} = p\,|\sigma)_A.$$



An important consequence of purification and local distinguishability
is the relation between equality upon input
of $\rho$ and equality on the purifications of $\rho$:

Theorem 1 (Equality upon input of $\rho$ vs equality
on purifications of $\rho$) Let $\Psi_\rho \in \mathsf{St}_1(AC)$ be a purification
of $\rho \in \mathsf{St}_1(A)$, and let $\mathcal{A}, \mathcal{A}' \in \mathsf{Transf}(A,B)$ be two
transformations. Then one has

$$(\mathcal{A} \otimes \mathcal{I}_C)\Psi_\rho = (\mathcal{A}' \otimes \mathcal{I}_C)\Psi_\rho \quad \Longleftrightarrow \quad \mathcal{A} =_\rho \mathcal{A}'.$$

Proof. See Theorem 1 of Ref. [21]. The proof of the
direction $\Longleftarrow$ uses the local distinguishability axiom. ■

As a consequence, the purification of a completely
mixed state allows for the tomography of transformations:

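Corollary 2 Let $\omega \in \mathsf{St}_1(A)$ be completely mixed and let
$\Psi_\omega \in \mathsf{St}_1(AC)$ be a purification of $\omega$. Then, for all transformations
$\mathcal{A}, \mathcal{A}' \in \mathsf{Transf}(A, B)$ one has

$$(\mathcal{A} \otimes \mathcal{I}_C)\Psi_\omega = (\mathcal{A}' \otimes \mathcal{I}_C)\Psi_\omega \quad \Longleftrightarrow \quad \mathcal{A} = \mathcal{A}'.$$

Proof. By Theorem 1 the first condition is equivalent
to $\mathcal{A} =_\omega \mathcal{A}'$. Since $\omega$ is completely mixed, this means
$\mathcal{A}\sigma = \mathcal{A}'\sigma$ for every $\sigma \in \mathsf{St}_1(A)$. By local distinguishability
[see Eq. (5)] this implies $\mathcal{A} = \mathcal{A}'$. ■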

Corollary 2 shows that the state $(\mathcal{A} \otimes \mathcal{I}_C)\Psi_\omega$ characterizes
the transformation $\mathcal{A}$ completely. We will express
this fact by saying that the state $\Psi_\omega$ is dynamically faithful
[ pp| , or just faithful, for short. Using this notion we
can rephrase Corollary 2 as:

Corollary 3 If $\Psi \in \mathsf{St}_1(AC)$ is pure and its marginal
on system A is completely mixed, then $\Psi$ is dynamically
faithful for system A.

Let us choose a fixed faithful state for system A, say
$\Psi \in \mathsf{St}_1(AC)$. Then for every transformation
$\mathcal{C} \in \mathsf{Transf}(A,B)$ we can define the Choi state $R_{\mathcal{C}} \in \mathsf{St}(BC)$
as

$$R_{\mathcal{C}} := (\mathcal{C} \otimes \mathcal{I}_C)\Psi.$$

We then have the following:

Theorem 2 (Choi isomorphism) For a given faithful
state $\Psi \in \mathsf{St}_1(AC)$ the map $\mathcal{C} \mapsto R_{\mathcal{C}} := (\mathcal{C} \otimes \mathcal{I}_C)\Psi$ has
the following properties:

1. it defines a bijective correspondence between tests
$\{\mathcal{C}_i\}_{i \in X}$ from A to B and collections of states
$\{R_i\}_{i \in X}$ for BC satisfying

$$\sum_{i \in X} (e_B|R_i)_{BC} = (e_A|\Psi)_{AC}.$$

2. the transformation $\mathcal{C}$ is atomic if and only if the
corresponding state $R_{\mathcal{C}}$ is pure.

Proof. See Theorem 17 of Ref. [21]. ■

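A simple consequence of the Choi isomorphism is the
following:

Corollary 4 Let $\{\mathcal{C}_i\}_{i \in X} \subset \mathsf{Transf}(A,B)$ be a collection
of transformations. Then, $\{\mathcal{C}_i\}_{i \in X}$ is a test if and only
if

$$\sum_{i \in X} (e_B|\,\mathcal{C}_i = (e_A|.$$

In particular, let $\{a_i\}_{i \in X} \subset \mathsf{Eff}(A)$ be a collection of effects.
Then, $\{a_i\}_{i \in X}$ is an observation-test if and only
if

$$\sum_{i \in X} a_i = e_A. \qquad (8)$$

Proof. Apply item 1 of Theorem 2 to the collection of
states $\{R_i\}_{i \in X}$ defined by $R_i := (\mathcal{C}_i \otimes \mathcal{I}_C)\Psi$. ■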

A much deeper consequence of the Choi isomorphism
is the following theorem:

Theorem 3 (States specify the theory) Let $\Theta, \Theta'$ be
two theories satisfying the purification postulate. If $\Theta$
and $\Theta'$ have the same sets of normalized states, then $\Theta' = \Theta$.

Proof. See Theorem 19 of Ref. [21]. ■

Thanks to Theorem 3, to derive quantum theory we will
only need to prove that our principles imply that for every
system A the normalized states $\mathsf{St}_1(A)$ can be described
as positive Hermitian matrices with unit trace.
Once this is proved, Theorem 3 automatically ensures that
all the dynamics and all the measurements allowed by the
theory are exactly the dynamics and the measurements
allowed in quantum theory.
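In the quantum case the two properties of the Choi isomorphism can be checked on a qubit. The sketch below uses the maximally entangled state as the faithful state $\Psi$ and an amplitude-damping channel as the test; both choices, the value of `gamma` and all helper names are assumptions made for the illustration.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
# Faithful state: projector on (|00> + |11>)/sqrt(2), a purification of I/2.
PSI = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]

def choi_state(kraus):
    """R = sum_k (C_k (x) I) Psi (C_k (x) I)^dagger for qubit Kraus operators."""
    out = [[0j] * 4 for _ in range(4)]
    for c in kraus:
        lifted = kron(c, I2)
        term = matmul(matmul(lifted, PSI), dagger(lifted))
        out = [[out[r][s] + term[r][s] for s in range(4)] for r in range(4)]
    return out

def marginal_on_c(r):
    """Apply the deterministic effect (trace) on the first qubit."""
    return [[sum(r[2 * i + k][2 * i + l] for i in range(2)) for l in range(2)]
            for k in range(2)]

def purity(r):
    return trace(matmul(r, r)).real

gamma = 0.3
k0 = [[1, 0], [0, math.sqrt(1 - gamma)]]
k1 = [[0, math.sqrt(gamma)], [0, 0]]

r_channel = choi_state([k0, k1])  # deterministic transformation
r_atomic = choi_state([k0])       # atomic transformation (single Kraus operator)

print(marginal_on_c(r_channel))                       # equals the marginal of Psi, I/2
print(purity(r_atomic), trace(r_atomic).real ** 2)    # equal: rank-one (pure) Choi state
print(purity(r_channel))                              # below 1: mixed Choi state
```

The marginal condition is the quantum instance of item 1 of Theorem 2, and the rank-one check is the instance of item 2.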

Note that in the definition of the Choi state we left
the freedom to choose the faithful state $\Psi \in \mathsf{St}_1(AC)$.
Among many possibilities, one convenient choice is to
take a faithful state $\Psi \in \mathsf{St}_1(AC)$ obtained as a purification
of the invariant state $\chi \in \mathsf{St}_1(A)$. Moreover, as we
will see in the next paragraph, we can always choose the
purifying system C in such a way that the marginal on
C is completely mixed.



C. Results about the combination of compression 
and purification 

An important consequence of the combination of the
purification postulate with the compression axiom is the
fact that one can always choose a purification of $\rho$ such
that the marginal state on the purifying system is completely
mixed. To prove this result we need the following
lemma:

Lemma 13 Let $\rho \in \mathsf{St}_1(A)$ be a state and let
$\Psi_\rho \in \mathsf{St}_1(AB)$ be a purification of $\rho$. If $\mathcal{E} \in \mathsf{Transf}(A, C)$ is
the encoding operation in the compression scheme of Axiom 3,
then the state $(\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho$ is pure.

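Proof. Let $\mathcal{D} \in \mathsf{Transf}(C,A)$ be the decoding operation.
Since the compression is lossless for $\rho$ we know
that $\mathcal{D}\mathcal{E} =_\rho \mathcal{I}_A$. By Theorem 1 this is equivalent to
the condition $(\mathcal{D}\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho = \Psi_\rho$. Now, suppose that
$(\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho = \sum_{i \in X} \Gamma_i$. Applying $\mathcal{D}$ on both sides we
then obtain $\Psi_\rho = \sum_{i \in X} (\mathcal{D} \otimes \mathcal{I}_B)\Gamma_i$, and, since $\Psi_\rho$ is
pure, for every $i \in X$ we must have $(\mathcal{D} \otimes \mathcal{I}_B)\Gamma_i = p_i \Psi_\rho$,
where $p_i \ge 0$ is some probability. Finally, since
$\mathcal{E}\mathcal{D} = \mathcal{I}_C$ (Lemma 2), one has $\Gamma_i = p_i (\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho$.
Hence, $(\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho$ admits only decompositions with
$\Gamma_i = p_i (\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho$, that is, $(\mathcal{E} \otimes \mathcal{I}_B)\Psi_\rho$ is pure. ■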
We are now in position to prove the desired result: 

Theorem 4 For every state $\rho \in \mathrm{St}_1(A)$ there exists a
system C and a purification $\Psi_\rho \in \mathrm{St}_1(AC)$ of
$\rho$ such that the marginal state on system C is completely mixed.
Moreover, the system C is unique up to operational equivalence.

Proof. Take an arbitrary purification of $\rho$, say
$\Phi_\rho \in \mathrm{St}_1(AB)$ for some purifying system B. Define
the marginal state on system B as
$|\theta)_B := (e|_A |\Phi_\rho)_{AB}$ and define the state
$\Psi_\rho := (\mathcal{I}_A \otimes \mathscr{E})\Phi_\rho$, where
$\mathscr{E} \in \mathrm{Transf}(B, C)$ is the encoding operation for
state $\theta$. By Lemma 13 we know that
$\Psi_\rho \in \mathrm{St}(AC)$ is pure. Using lemma || and theorem ||
we obtain
$(e|_C |\Psi_\rho)_{AC} = [(e|_C \mathscr{E}]\, |\Phi_\rho)_{AB} = (e|_B |\Phi_\rho)_{AB} = |\rho)_A$,
that is, $\Psi_\rho$ is a purification of $\rho$. Finally, the
marginal on system C is given by
$(e|_A |\Psi_\rho)_{AC} = \mathscr{E}|\theta)_B$, which by Lemma || is
completely mixed. This proves the first part of the thesis. It
remains to show that the system C is uniquely defined up to
operational equivalence. Suppose that
$\Psi'_\rho \in \mathrm{St}(AC')$ is another purification of $\rho$
with the property that the marginal on system C' is completely mixed.
Since $\Psi_\rho$ and $\Psi'_\rho$ are two purifications of the same
state, there must be two channels
$\mathscr{C} \in \mathrm{Transf}(C, C')$ and
$\mathscr{R} \in \mathrm{Transf}(C', C)$ such that
$\Psi'_\rho = (\mathcal{I}_A \otimes \mathscr{C})\Psi_\rho$ and
$\Psi_\rho = (\mathcal{I}_A \otimes \mathscr{R})\Psi'_\rho$ (lemma 11).
Combining the two equalities one obtains
$\Psi_\rho = (\mathcal{I}_A \otimes \mathscr{R}\mathscr{C})\Psi_\rho$.
Now, the marginal of $\Psi_\rho$ on system C is completely mixed, and
this implies that $\Psi_\rho$ is faithful for system C (corollary ||).
Hence, we have $\mathscr{R}\mathscr{C} = \mathcal{I}_C$. Repeating the
same argument for $\Psi'_\rho$ we obtain
$\mathscr{C}\mathscr{R} = \mathcal{I}_{C'}$. Therefore, $\mathscr{C}$
is reversible and $\mathscr{R} = \mathscr{C}^{-1}$. This proves that C
and C' are operationally equivalent. ■

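The following facts will also be useful.

Corollary 5 Let $\Psi_\rho \in \mathrm{St}_1(AB)$ be a purification of
$\rho \in \mathrm{St}_1(A)$ and let
$\mathscr{E} \in \mathrm{Transf}(A, C)$ be the encoding for $\rho$.
Then, the state
$(\mathscr{E} \otimes \mathcal{I}_B)\Psi_\rho \in \mathrm{St}_1(CB)$
is dynamically faithful for C.

Proof. The marginal of $(\mathscr{E} \otimes \mathcal{I}_B)\Psi_\rho$
on system C is $\mathscr{E}\rho$, which is completely mixed by lemma
||. Hence, $(\mathscr{E} \otimes \mathcal{I}_B)\Psi_\rho$ is
dynamically faithful by corollary ||. ■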

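Lemma 14 The decoding transformation
$\mathscr{D} \in \mathrm{Transf}(C, A)$ in the ideal compression for
$\rho \in \mathrm{St}_1(A)$ is atomic.

Proof. Let $\Psi_\rho \in \mathrm{St}_1(AB)$ be a purification of
$\rho$, for some purifying system B. Since
$\mathscr{D}\mathscr{E} =_\rho \mathcal{I}_A$ (the compression is
lossless), we have
$(\mathscr{D}\mathscr{E} \otimes \mathcal{I}_B)|\Psi_\rho) = |\Psi_\rho)$
(theorem 1). Now, by corollary 5,
$(\mathscr{E} \otimes \mathcal{I}_B)|\Psi_\rho)$ is faithful for C and
by lemma 13 $(\mathscr{E} \otimes \mathcal{I}_B)|\Psi_\rho)$ is pure.
Using the Choi isomorphism with the faithful state
$\Phi := (\mathscr{E} \otimes \mathcal{I}_B)\Psi_\rho$, the Choi state
of $\mathscr{D}$ is
$(\mathscr{D} \otimes \mathcal{I}_B)\Phi = \Psi_\rho$, which is pure;
we then obtain that $\mathscr{D}$ is atomic. ■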



D. Teleportation and the link product 

For every system A one can choose a completely mixed state $\omega_A$
and a purification $\Psi^{(A)} \in \mathrm{St}(A\tilde{A})$ such that
the marginal on system $\tilde{A}$ is completely mixed (cf. theorem
4). Any such purification allows for a probabilistic teleportation
scheme:

Lemma 15 (Probabilistic teleportation) There exists an atomic effect
$E^{(A)} \in \mathrm{Eff}(\tilde{A}A)$ and a non-zero probability
$p_A$ such that

$$\bigl[\mathcal{I}_A \otimes (E^{(A)}|_{\tilde{A}A}\bigr]\,\bigl[\,|\Psi^{(A)})_{A\tilde{A}} \otimes \mathcal{I}_A\bigr] = p_A\,\mathcal{I}_A. \qquad (8)$$

Proof. See Corollary 19 of Ref. [21]. ■

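Let us choose $\Psi^{(A)}$ to be the faithful state in the definition
of the Choi isomorphism. Then the sequential composition of
transformations induces a composition of Choi states in the following
way:

Corollary 6 (Link product) For two transformations
$\mathscr{C} \in \mathrm{Transf}(A, B)$ and
$\mathscr{D} \in \mathrm{Transf}(B, C)$ the Choi state of
$\mathscr{D}\mathscr{C} \in \mathrm{Transf}(A, C)$ is given by the
link product

$$|R_{\mathscr{D}\mathscr{C}})_{C\tilde{A}} = \frac{1}{p_B}\,(E^{(B)}|_{\tilde{B}B}\,\bigl[\,|R_{\mathscr{C}})_{B\tilde{A}} \otimes |R_{\mathscr{D}})_{C\tilde{B}}\bigr]. \qquad (9)$$

Proof. See Corollary 22 of Ref. [21]. ■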

We conclude this paragraph with an important result 
that follows from the combination of the link product 
structure with the pure conditioning axiom: 

Lemma 16 (Atomicity of composition) The composition of two atomic
transformations is atomic.

Proof. Let $\mathscr{C} \in \mathrm{Transf}(A, B)$ and
$\mathscr{D} \in \mathrm{Transf}(B, C)$ be two atomic transformations.
By the Choi isomorphism, the (unnormalized) states $R_{\mathscr{C}}$
and $R_{\mathscr{D}}$ are pure. Since the teleportation effect
$E^{(B)}$ in Eq. (9) is atomic (lemma 15), the pure conditioning axiom
5 implies that the state $R_{\mathscr{D}\mathscr{C}}$ is pure. By the
Choi isomorphism this means that $\mathscr{D}\mathscr{C}$ is atomic. ■



E. No information without disturbance 

We say that a test
$\{\mathscr{C}_i\}_{i\in X} \subset \mathrm{Transf}(A)$ is
non-disturbing upon input of $\rho$ if
$\sum_{i\in X} \mathscr{C}_i =_\rho \mathcal{I}_A$. If $\rho$ is
completely mixed, we simply say that the test is non-disturbing.

A consequence of the purification postulate is the fol- 
lowing "no-information without disturbance" result: 



Lemma 17 (No information without disturbance) A test
$\{\mathscr{C}_i\}_{i\in X} \subset \mathrm{Transf}(A)$ is
non-disturbing upon input of $\rho$ if and only if there is a set of
probabilities $\{p_i\}_{i\in X}$ such that
$\mathscr{C}_i =_\rho p_i \mathcal{I}_A$ for every $i \in X$.

Proof. See Theorem 10 of Ref. ||. ■

The no-information without disturbance result implies the following
geometrical limitation:

Corollary 7 For every system A the convex set of states
$\mathrm{St}_1(A)$ is not a segment.

Proof. The proof is by contradiction. Suppose that for some system A
the set $\mathrm{St}_1(A)$ is a segment. The segment has only two pure
states, say $\varphi_1$ and $\varphi_2$, and every other state
$\rho \in \mathrm{St}_1(A)$ is completely mixed. Then the
distinguishability axiom 2 imposes that $\varphi_1$ and $\varphi_2$
are perfectly distinguishable. Take the binary test $\{a_1, a_2\}$
such that $(a_i|\varphi_j) = \delta_{ij}$ and define the
"measure-and-prepare" test $\{\mathscr{C}_1, \mathscr{C}_2\}$ as
$\mathscr{C}_i = |\varphi_i)(a_i|$, $i = 1, 2$ (the possibility of
preparing a state depending on the outcome of a previous measurement
is guaranteed by causality ||). Since every state $\rho$ in the
segment can be written as a convex combination of the two extreme
points, we have that the test $\{\mathscr{C}_1, \mathscr{C}_2\}$ is
non-disturbing: $(\mathscr{C}_1 + \mathscr{C}_2)\rho = \rho$ for every
$\rho$. This is in contradiction with lemma 17 because $\mathscr{C}_1$
and $\mathscr{C}_2$ are not proportional to the identity. ■

We know that no information can be extracted without disturbance. In
the following we will prove a result in the converse direction: if a
measurement extracts no information, then it can be realized in a
non-disturbing fashion. To show this result we first need the
following:

Lemma 18 For every observation test
$\{a_i\}_{i\in X} \subset \mathrm{Eff}(A)$ with finite outcome set X
there is a system C and a test
$\{\mathscr{A}_i\}_{i\in X} \subset \mathrm{Transf}(A, C)$ consisting
of atomic transformations such that $(a_i|_A = (e|_C \mathscr{A}_i$.

Proof. Let $|\Psi)_{AB}$ be a pure faithful state for system A and let
$|R_i)_B := (a_i|_A |\Psi)_{AB}$ be the Choi state of $a_i$. Take a
purification of $R_i$, say $|\Psi_i)_{CB}$ for some purifying system C
[38]. Then, by the Choi isomorphism there is a test
$\{\mathscr{A}_i\}_{i\in X}$, with input A and output C, such that

$$|\Psi_i)_{CB} = (\mathscr{A}_i \otimes \mathcal{I}_B)|\Psi)_{AB}$$

[see item 1 of theorem 2]. Moreover, each transformation
$\mathscr{A}_i : A \to C$ is atomic (item 2 of theorem 2). Applying
the deterministic effect $(e|_C$ on both sides we then obtain
$|R_i)_B = (e|_C |\Psi_i)_{CB} = (e|_C \mathscr{A}_i |\Psi)_{AB}$. By
definition of $R_i$, this implies
$(a_i|_A |\Psi)_{AB} = (e|_C \mathscr{A}_i |\Psi)_{AB}$, and, since
$\Psi$ is dynamically faithful, $(a_i|_A = (e|_C \mathscr{A}_i$. ■

Theorem 5 Let $\rho \in \mathrm{St}_1(A)$ be a state,
$a \in \mathrm{Eff}(A)$ be an effect, and
$\mathscr{A} \in \mathrm{Transf}(A, B)$ be an atomic transformation
such that $(a|_A = (e|_B \mathscr{A}$. If $(a| =_\rho p\,(e|$ for some
$p > 0$, then there exists a channel
$\mathscr{C} \in \mathrm{Transf}(B, A)$ such that
$\mathscr{C}\mathscr{A} =_\rho p\,\mathcal{I}_A$.

Proof. Consider a purification of $\rho$, say
$\Psi_\rho \in \mathrm{St}(AC)$, and define the state
$\Sigma \in \mathrm{St}_1(BC)$ by
$|\Sigma) := \frac{1}{p}(\mathscr{A} \otimes \mathcal{I}_C)|\Psi_\rho)$.
By the atomicity of composition (lemma 16) the state $\Sigma$ is pure.
Moreover, we have

$$(e|_B |\Sigma)_{BC} = \frac{1}{p}\,(a|_A |\Psi_\rho)_{AC} = (e|_A |\Psi_\rho)_{AC},$$

having used theorem 1 in the last equality. This implies that
$\Psi_\rho$ and $\Sigma$ are different purifications of the same mixed
state on system C. Then, by lemma 11 there exists a channel
$\mathscr{C} \in \mathrm{Transf}(B, A)$ such that
$|\Psi_\rho) = (\mathscr{C} \otimes \mathcal{I}_C)|\Sigma) = \frac{1}{p}(\mathscr{C}\mathscr{A} \otimes \mathcal{I}_C)|\Psi_\rho)$.
By theorem 1, the last equality implies
$\mathscr{C}\mathscr{A} =_\rho p\,\mathcal{I}_A$. ■

We now make a simple observation that combined with Theorem 5 will
lead to some interesting consequences:

Lemma 19 If $(a|\rho) = \|a\|$, then $a =_\rho \|a\|\,e$. Similarly,
if $(a|\rho) = 0$, then $a =_\rho 0$.

Proof. By definition, $\sigma \in F_\rho$ iff there exists $p > 0$ and
$\tau \in \mathrm{St}_1(A)$ such that
$\rho = p\sigma + (1 - p)\tau$. If $(a|\rho) = \|a\|$, then we have
$\|a\| = p\,(a|\sigma) + (1 - p)\,(a|\tau)$. Since $(a|\sigma)$ and
$(a|\tau)$ cannot be larger than $\|a\|$, the only way to have the
equality is to have $(a|\sigma) = (a|\tau) = \|a\|$. By definition,
this amounts to saying $a =_\rho \|a\|\,e$. Similarly, if
$(a|\rho) = 0$, one has $0 = p\,(a|\sigma) + (1 - p)\,(a|\tau)$, which
is satisfied only if $(a|\sigma) = (a|\tau) = 0$, that is, if
$a =_\rho 0$. ■

As a consequence, we have the following:

Corollary 8 Let $\rho \in \mathrm{St}_1(A)$ be a state,
$a \in \mathrm{Eff}(A)$ be an effect, and
$\mathscr{A} \in \mathrm{Transf}(A, B)$ be an atomic transformation
such that $(a|_A = (e|_B \mathscr{A}$. If $(a|\rho) = 1$, then
$\mathscr{A}$ is correctable upon input of $\rho$, that is, there
exists a correction operation $\mathscr{C} \in \mathrm{Transf}(B, A)$
such that $\mathscr{C}\mathscr{A} =_\rho \mathcal{I}_A$.

Proof. If $(a|\rho) = 1$, then clearly $\|a\| = 1$. Lemma 19 then
implies $(a| =_\rho (e|$. Applying theorem 5 we finally obtain the
thesis. ■



Corollary 9 Let $\rho \in \mathrm{St}_1(A)$ be a state and
$a \in \mathrm{Eff}(A)$ be an effect such that $(a|\rho) = 1$. Then
there exists a transformation $\mathscr{A} \in \mathrm{Transf}(A)$
such that $(a| = (e|\mathscr{A}$ and
$\mathscr{A} =_\rho \mathcal{I}_A$.

Proof. Straightforward consequence of lemma 18 and of corollary 8. ■

Finally, we say that an observation-test $\{a_i\}_{i\in X}$ is
non-informative upon input of $\rho$ if we have
$(a_i| =_\rho p_i\,(e|$ for every $i \in X$. This means that the test
$\{a_i\}_{i\in X}$ is unable to distinguish the states in the face
$F_\rho$. As a consequence of theorem 5 we have the following "no
disturbance without information" result:

Corollary 10 (No disturbance without information) If the test
$\{a_i\}_{i\in X}$ is non-informative upon input of $\rho$ then there
is a test $\{\mathscr{D}_i\}_{i\in X} \subset \mathrm{Transf}(A)$
that is non-disturbing upon input of $\rho$ and satisfies
$(e|\mathscr{D}_i = (a_i|$ for every $i \in X$.

Proof. By lemma 18 there exists a test
$\{\mathscr{A}_i\} \subset \mathrm{Transf}(A, B)$ such that each
transformation $\mathscr{A}_i$ is atomic and
$(e|\mathscr{A}_i = (a_i|$. By theorem 5, for each $\mathscr{A}_i$
there is a correction channel $\mathscr{C}_i$ such that
$\mathscr{C}_i\mathscr{A}_i =_\rho p_i\,\mathcal{I}_A$. Defining
$\mathscr{D}_i := \mathscr{C}_i\mathscr{A}_i$ we then obtain the
thesis. ■



V. PERFECTLY DISTINGUISHABLE STATES 

In this section we prove some basic facts about per- 
fectly distinguishable states. Let us start from the defi- 
nition: 

Definition 5 (Perfectly distinguishable states) The normalized states
$\{\rho_i\}_{i=1}^N \subseteq \mathrm{St}_1(A)$ are perfectly
distinguishable if there exists an observation-test
$\{a_i\}_{i=1}^N$ such that $(a_j|\rho_i) = \delta_{ij}$. The
observation-test $\{a_i\}_{i=1}^N$ is called perfectly distinguishing.

From the distinguishability axiom 2 it is clear that every nontrivial
system has at least two perfectly distinguishable states:

Lemma 20 For every nontrivial system A there are at least two
perfectly distinguishable states.

Proof. Let $\varphi$ be a pure state of A. Obviously, $\varphi$ is not
completely mixed (unless the system A has only one state, that is,
unless A is trivial). Hence, by axiom 2 there exists at least a state
$\sigma$ that is perfectly distinguishable from $\varphi$. ■

An equivalent condition for perfect distinguishability 
is the following: 

Lemma 21 The states $\{\rho_i\}_{i=1}^N \subset \mathrm{St}_1(A)$ are
perfectly distinguishable if and only if there exists an
observation-test $\{a_i\}_{i=1}^N$ such that $(a_i|\rho_i) = 1$ for
every i.

Proof. The condition $(a_i|\rho_i) = 1$, $\forall i = 1, \dots, N$ is
clearly necessary. On the other hand, the condition
$(a_i|\rho_i) = 1$, $\forall i = 1, \dots, N$ implies

$$(a_i|\rho_i) = 1 = \sum_{j=1}^{N} (a_j|\rho_i) = (a_i|\rho_i) + \sum_{j \neq i} (a_j|\rho_i).$$

Since all probabilities are non-negative, we must have
$(a_j|\rho_i) = 0$ for $i \neq j$, and therefore,
$(a_j|\rho_i) = \delta_{ij}$. ■

A very general fact about state discrimination is ex- 
pressed by the following: 

Lemma 22 If $\rho$ is perfectly distinguishable from $\sigma$ and
$\rho'$ (resp. $\sigma'$) belongs to the face identified by $\rho$
(resp. $\sigma$), then $\rho'$ is perfectly distinguishable from
$\sigma'$.

Proof. Let $\{a, e - a\}$ be the binary observation-test that
distinguishes perfectly between $\rho$ and $\sigma$. By definition,
$a \in \mathrm{Eff}(A)$ is such that $(a|\rho) = 1$ and
$(a|\sigma) = 0$. Now, by lemma 19, $(a|\rho') = 1$ and
$(a|\sigma') = 0$ for all $\rho' \in F_\rho$ and
$\sigma' \in F_\sigma$. ■
Thanks to purification and to the local distinguishability axiom 4, we
are also in position to show a much stronger result:

Lemma 23 Let $\{\rho_i\}_{i=1}^N \subset F_\rho$ and
$\{\rho_j\}_{j=N+1}^{N+M} \subset F_\sigma$ be two sets of perfectly
distinguishable states. If $\rho$ is perfectly distinguishable from
$\sigma$, then the states $\{\rho_i\}_{i=1}^{N+M}$ are perfectly
distinguishable.

Proof. Let $\{a, e_A - a\}$ be the observation-test such that
$(a|\rho) = 1$ and $(a|\sigma) = 0$. Now, by corollary 9 there is a
transformation $\mathscr{A} \in \mathrm{Transf}(A)$ such that
$(e_A|\mathscr{A} = (a|$ and $\mathscr{A} =_\rho \mathcal{I}_A$.
Similarly, there exists a transformation
$\mathscr{A}' \in \mathrm{Transf}(A)$ such that
$(e_A|\mathscr{A}' = (e_A| - (a|$ and
$\mathscr{A}' =_\sigma \mathcal{I}_A$. We can then define the
following observation-test

$$(c_i| := \begin{cases} (a_i|\mathscr{A} & i \le N \\ (b_i|\mathscr{A}' & N + 1 \le i \le N + M \end{cases}$$

where $\{a_i\}_{i=1}^N$ (resp. $\{b_j\}_{j=N+1}^{N+M}$) is the
observation-test that perfectly distinguishes among the states
$\{\rho_i\}_{i=1}^N$ (resp. $\{\rho_j\}_{j=N+1}^{N+M}$). By corollary
|| [see in particular Eq. ||], $\{c_i\}_{i=1}^{N+M}$ is indeed an
observation-test: each $c_i$ is an effect and one has the
normalization

$$\sum_{i=1}^{N+M} (c_i| = \sum_{i=1}^{N} (a_i|\mathscr{A} + \sum_{i=N+1}^{N+M} (b_i|\mathscr{A}' = (e_A|\mathscr{A} + (e_A|\mathscr{A}' = (a| + (e_A| - (a| = (e_A|.$$

Moreover, since $\mathscr{A} =_\rho \mathcal{I}_A$ and
$\mathscr{A}' =_\sigma \mathcal{I}_A$, one has $(c_i|\rho_i) = 1$ for
every $i = 1, \dots, M + N$. By lemma 21, this implies that the states
$\{\rho_i\}_{i=1}^{N+M}$ are perfectly distinguishable. ■

Definition 6 A set of perfectly distinguishable states
$\{\rho_i\}_{i=1}^N$ is maximal if there is no state
$\rho_{N+1} \in \mathrm{St}_1(A)$ such that the states
$\{\rho_i\}_{i=1}^{N+1}$ are perfectly distinguishable.

Theorem 6 A set of perfectly distinguishable states
$\{\rho_i\}_{i=1}^N$ is maximal if and only if the state
$\omega = \sum_{i=1}^N \rho_i/N$ is completely mixed.

Proof. We first prove that if $\omega$ is completely mixed, then the
set $\{\rho_i\}_{i=1}^N$ must be maximal. Indeed, if there existed a
state $\rho_{N+1}$ such that $\{\rho_i\}_{i=1}^{N+1}$ are perfectly
distinguishable, then clearly $\rho_{N+1}$ would be distinguishable
from $\omega$. This is absurd because by proposition 1 no state can be
perfectly distinguished from a completely mixed state. Conversely, if
$\{\rho_i\}_{i=1}^N$ is maximal, then $\omega$ is completely mixed. If
it were not, by the distinguishability axiom 2, $\omega$ would be
perfectly distinguishable from some state $\rho_{N+1}$. By lemma 23,
this would imply that the states $\{\rho_i\}_{i=1}^{N+1}$ are
perfectly distinguishable, in contradiction with the hypothesis that
the set $\{\rho_i\}_{i=1}^N$ is maximal. ■

Lemma 24 Every set of perfectly distinguishable pure states can be
extended to a maximal set of perfectly distinguishable pure states.

Proof. Let $\{\varphi_i\}_{i=1}^N$ be a non-maximal set of perfectly
distinguishable pure states. By definition, there exists a state
$\sigma$ such that $\{\varphi_i\}_{i=1}^N \cup \{\sigma\}$ is
perfectly distinguishable. Let $\varphi_{N+1}$ be a pure state in
$F_\sigma$. By Lemma 23 the states $\{\varphi_i\}_{i=1}^{N+1}$ will be
perfectly distinguishable. Since the dimension of
$\mathrm{St}_{\mathbb{R}}(A)$ is finite and distinguishable states are
linearly independent, iterating this procedure one finally obtains a
maximal set of pure states in a finite number of steps. ■

Corollary 11 Any pure state belongs to a maximal set of perfectly
distinguishable pure states.

We conclude this section with a few elementary facts about how the
ideal compression of axiom 3 preserves the distinguishability
properties. In the following we will choose a state
$\rho \in \mathrm{St}_1(A)$ and
$\mathscr{E} \in \mathrm{Transf}(A, C)$ (resp.
$\mathscr{D} \in \mathrm{Transf}(C, A)$) will be the encoding (resp.
decoding) in the ideal compression scheme for $\rho$.

Lemma 25 If the states $\{\rho_i\}_{i=1}^k \subset F_\rho$ are
perfectly distinguishable, then the states
$\{\mathscr{E}\rho_i\}_{i=1}^k \subset \mathrm{St}_1(C)$ are perfectly
distinguishable. Conversely, if the states
$\{\sigma_i\}_{i=1}^k \subset \mathrm{St}_1(C)$ are perfectly
distinguishable, then the states
$\{\mathscr{D}\sigma_i\}_{i=1}^k \subset F_\rho$ are perfectly
distinguishable.

Proof. Let $\{a_i\}_{i=1}^k$ be the observation-test such that
$(a_i|\rho_i) = 1$ for every $i = 1, \dots, k$. Since the compression
is lossless, we have $\mathscr{D}\mathscr{E}|\rho_i) = |\rho_i)$ and
$(a_i|\mathscr{D}\mathscr{E}|\rho_i) = 1$. Now, consider the test
$\{c_i\}_{i=1}^k$ defined by $(c_i| = (a_i|\mathscr{D}$. Clearly we
have $(c_i|\mathscr{E}|\rho_i) = 1$ for every $i = 1, \dots, k$. By
lemma 21 this means that the states $\{\mathscr{E}\rho_i\}_{i=1}^k$
are perfectly distinguishable. Similarly, let $\{b_i\}_{i=1}^k$ be the
observation-test that distinguishes the set $\{\sigma_i\}_{i=1}^k$.
Since $\mathscr{E}\mathscr{D} = \mathcal{I}_C$ (lemma ||), we can
conclude by the same argument that the states
$\{\mathscr{D}\sigma_i\}_{i=1}^k$ are perfectly distinguishable. ■

We say that a set of perfectly distinguishable states
$\{\rho_i\}_{i=1}^k \subseteq F_\rho$ is maximal in the face $F_\rho$
if there is no state $\rho_{k+1} \in F_\rho$ such that the states
$\{\rho_i\}_{i=1}^{k+1}$ are perfectly distinguishable. We then have
the following:

Corollary 12 If $\{\rho_i\}_{i=1}^k \subset F_\rho$ is a maximal set
of perfectly distinguishable states in the face $F_\rho$, then
$\{\mathscr{E}\rho_i\}_{i=1}^k \subseteq \mathrm{St}_1(C)$ is a
maximal set of perfectly distinguishable states. Conversely, if
$\{\sigma_i\}_{i=1}^k \subseteq \mathrm{St}_1(C)$ is a maximal set of
perfectly distinguishable states, then
$\{\mathscr{D}\sigma_i\}_{i=1}^k$ is a maximal set of perfectly
distinguishable states in the face $F_\rho$.



Proof. Distinguishability of the states
$\{\mathscr{E}\rho_i\}_{i=1}^k$ and $\{\mathscr{D}\sigma_i\}_{i=1}^k$
is proved by lemma 25. Let us now prove maximality. By contradiction,
suppose that the set $\{\rho_i\}_{i=1}^k$ is maximal in the face
$F_\rho$ while the set $\{\sigma_i\}_{i=1}^k$,
$\sigma_i := \mathscr{E}\rho_i$, is not maximal. This means that there
exists a state $\sigma_{k+1} \in \mathrm{St}_1(C)$ such that the
states $\{\sigma_i\}_{i=1}^{k+1}$ are perfectly distinguishable. By
lemma 25 the states $\{\mathscr{D}\sigma_i\}_{i=1}^{k+1}$ are
perfectly distinguishable. Since
$\mathscr{D}\sigma_i = \mathscr{D}\mathscr{E}\rho_i = \rho_i$ for
every $i = 1, \dots, k$, this means that the states
$\{\rho_i\}_{i=1}^k \cup \{\mathscr{D}\sigma_{k+1}\}$ are perfectly
distinguishable, in contradiction with the fact that
$\{\rho_i\}_{i=1}^k$ is maximal. This proves that the set
$\{\mathscr{E}\rho_i\}_{i=1}^k$ must be maximal. Conversely, if the
set $\{\sigma_i\}_{i=1}^k \subset \mathrm{St}_1(C)$ is maximal, using
the same argument we can prove that the set
$\{\mathscr{D}\sigma_i\}_{i=1}^k$ must be maximal in $F_\rho$. ■
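The criterion of Theorem 6 can be illustrated in the familiar quantum case. The sketch below is only an illustration under stated assumptions: states are taken to be diagonal density matrices, "completely mixed" is read as "full rank" (the quantum meaning of lying in the interior of the state space), and all function names are hypothetical helpers introduced here.

```python
from fractions import Fraction

def basis_projector_diagonal(index, dim):
    """Diagonal of the projector |index><index| in dimension dim."""
    return [Fraction(int(k == index)) for k in range(dim)]

def uniform_mixture_diagonal(indices, dim):
    """Diagonal of the equal-weight mixture of the given basis states."""
    n = len(indices)
    total = [Fraction(0)] * dim
    for i in indices:
        proj = basis_projector_diagonal(i, dim)
        total = [t + p / n for t, p in zip(total, proj)]
    return total

def is_completely_mixed(diagonal):
    """Assumption: a diagonal density matrix is completely mixed iff it has full rank."""
    return all(p > 0 for p in diagonal)

def is_maximal(indices, dim):
    """Theorem 6 criterion: maximal iff the uniform mixture is completely mixed."""
    return is_completely_mixed(uniform_mixture_diagonal(indices, dim))

print(is_maximal([0, 1], 3))     # two basis states of a qutrit: False
print(is_maximal([0, 1, 2], 3))  # the full basis: True
```

In this toy setting the two-element set fails the criterion because a third distinguishable state can still be added, which matches the extension procedure of Lemma 24.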



VI. DUALITY BETWEEN PURE STATES AND 
ATOMIC EFFECTS 

We now show the existence of a one-to-one correspon- 
dence between states and effects of any system A in the 
theory. Let us start from a simple observation: 



Lemma 26 If a is atomic and $(a|\rho) = \|a\|$ for
$\rho \in \mathrm{St}_1(A)$, then $\rho$ must be pure.

Proof. By lemma 19 the condition $(a|\rho) = \|a\|$ implies
$a =_\rho \|a\|\,e$. By theorem 1, the condition $a =_\rho \|a\|\,e$
implies

$$(a|_A |\Psi_\rho)_{AB} = \|a\|\,(e|_A |\Psi_\rho)_{AB},$$

where $\Psi_\rho \in \mathrm{St}_1(AB)$ is any purification of
$\rho$. Since a is atomic, the pure conditioning axiom 5 implies that
the marginal state $|\tilde{\rho})_B = (e|_A |\Psi_\rho)_{AB}$ is
pure. Since the marginal of $\Psi_\rho$ on system B is pure,
$\Psi_\rho$ must be factorized, i.e.
$\Psi_\rho = \rho \otimes \tilde{\rho}$ (see lemma 19 of Ref. [1]).
Hence, $\rho$ must be pure, otherwise we would have a non-trivial
convex decomposition of the pure state $\Psi_\rho$. ■

We are now in position to show that every atomic effect 
is associated to a unique pure state. 

Theorem 7 For every atomic effect $a \in \mathrm{Eff}(A)$, there
exists a unique pure state $\varphi \in \mathrm{St}_1(A)$ such that
$(a|\varphi) = \|a\|$.

Proof. Let $\rho$ be a state such that $(a|\rho) = \|a\|$. By lemma
26, $\rho$ must be pure. Moreover, this pure state must be unique:
suppose that $\varphi$ and $\varphi'$ are pure states such that
$(a|\varphi) = (a|\varphi') = \|a\|$. Then for
$\omega = \frac{1}{2}(\varphi + \varphi')$ one has
$(a|\omega) = \|a\|$. Since $\omega$ must be pure, one has
$\varphi = \varphi'$. ■

We now show the converse result: for every pure state
$\varphi \in \mathrm{St}_1(A)$ there exists a unique atomic effect a
such that $(a|\varphi) = 1$. Let us start from the existence:

Lemma 27 Let $\{\varphi_i\}_{i=1}^N \subset \mathrm{St}_1(A)$ be a
maximal set of perfectly distinguishable pure states and let
$\{a_i\}_{i=1}^N$ be the observation-test such that
$(a_i|\varphi_j) = \delta_{ij}$. Then each effect $a_i$ is atomic with
$\|a_i\| = 1$.

Proof. It is obvious that $\|a_i\| = 1$, because of the condition
$(a_i|\varphi_i) = 1$. It remains to prove atomicity. Consider the
state $\omega = \sum_{i=1}^N \varphi_i/N$, which is completely mixed
by theorem 6. Let $\Psi_\omega \in \mathrm{St}_1(AB)$ be a
purification of $\omega$, chosen in such a way that the marginal on
system B is completely mixed (theorem 4). As a consequence of
purification (lemma ||), there exists an observation-test
$\{b_i\}_{i=1}^N$ on system B such that
$(b_i|_B |\Psi_\omega)_{AB} = \frac{1}{N}|\varphi_i)_A$. Since
$\Psi_\omega$ is dynamically faithful on system B, each effect $b_i$
must be atomic. Now, define the normalized states
$\{\rho_i\}_{i=1}^N \subset \mathrm{St}_1(B)$ and the probabilities
$\{p_i\}_{i=1}^N$ by

$$(a_i|_A |\Psi_\omega)_{AB} = p_i\,|\rho_i)_B. \qquad (10)$$

Applying the deterministic effect $e_B$ on both sides one has
$p_i = (a_i|\omega) = 1/N$. On the other hand, applying the effect
$b_j$ one has instead
$\frac{1}{N}(b_j|\rho_i)_B = \frac{1}{N}(a_i|\varphi_j) = \delta_{ij}/N$.
This implies $(b_i|\rho_i) = 1$ for every i. Since $b_i$ is atomic,
lemma 26 forces each $\rho_i$ to be pure. Finally, each $a_i$ must be
atomic since its Choi state
$p_i|\rho_i)_B = (a_i|_A |\Psi_\omega)_{AB}$ is pure (theorem 2). ■

As a consequence, we can prove the following existence 
result: 

Lemma 28 For every pure state $\varphi \in \mathrm{St}_1(A)$ there
exists an atomic effect a such that $(a|\varphi) = 1$.

Proof. By corollary 11, every pure state belongs to a maximal set of
perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^N$, say
$\varphi = \varphi_1$. The thesis then follows from lemma 27. ■

We now prove that the atomic effect a such that $(a|\varphi) = 1$ is
unique. For this purpose we need two auxiliary lemmas:

Lemma 29 Let $\varphi \in \mathrm{St}_1(A)$ be an arbitrary pure state
and let $p_\varphi$ be the probability defined by

$$p_\varphi = \max\{p : \exists \sigma,\ \chi = p\varphi + (1 - p)\sigma\} \qquad (11)$$

where $\chi$ is the invariant state of system A. Then the value of the
probability $p_\varphi$ is independent of $\varphi$.

Proof. Since for every couple of pure states $\varphi$ and $\psi$ one
has $\psi = \mathscr{U}\varphi$ for some reversible channel
$\mathscr{U}$ (lemma ||), and since $\chi$ is invariant, one has
$\chi = p\varphi + (1 - p)\sigma$ if and only if
$\chi = p\psi + (1 - p)\mathscr{U}\sigma$. The maximum probabilities
for $\varphi$ and $\psi$ are then equal. ■

Since $p_\varphi = p_\psi$ for every couple of pure states, from now
on we will write $p_{\max}$ in place of $p_\varphi$.

Lemma 30 Let $\varphi \in \mathrm{St}_1(A)$ be a pure state and
$a \in \mathrm{Eff}(A)$ be an atomic effect such that
$(a|\varphi) = 1$. Let $|\Phi)_{AB}$ be a purification of the
invariant state $|\chi)_A$, chosen in such a way that the marginal on
system B is completely mixed, and let b be the unique atomic effect on
B such that

$$(b|_B |\Phi)_{AB} = p_{\max}\,|\varphi)_A \qquad (12)$$

[note that b exists by lemma || and is uniquely defined by Eq. (12)
because $\Phi$ is faithful for system B]. Then one has

$$(a|_A |\Phi)_{AB} = p_{\max}\,|\psi)_B \qquad (13)$$

where $\psi$ is the unique pure state such that $(b|\psi) = 1$.

Proof. Define the normalized pure state $\psi$ and the probability q
by

$$(a|_A |\Phi)_{AB} = q\,|\psi)_B. \qquad (14)$$

In order to prove the thesis we have to show that $q = p_{\max}$ and
$(b|\psi) = 1$. Applying b on both sides of Eq. (14) and using Eq.
(12) we obtain $q\,(b|\psi) = p_{\max}(a|\varphi) = p_{\max}$. This
implies

$$q \ge p_{\max}, \qquad (15)$$

with the equality if and only if $(b|\psi) = 1$. Let b' be an atomic
effect such that $(b'|\psi) = 1$ (such an effect exists because of
lemma 28). Define the normalized pure state $\varphi'$ and the
probability p' by

$$(b'|_B |\Phi)_{AB} = p'\,|\varphi')_A.$$

Applying a on both sides and using Eq. (14) we obtain
$p'(a|\varphi') = q\,(b'|\psi) = q$, which implies $p' \ge q$, with
the equality if and only if $(a|\varphi') = 1$. Combining this with
the inequality (15) we have $p' \ge q \ge p_{\max}$. On the other
hand, by Lemma 29 one has $p' \le p_{\max}$, and consequently
$p' = q = p_{\max}$. This also implies that $(b|\psi) = 1$ and
$(a|\varphi') = 1$. ■

Theorem 8 For every pure state $\varphi \in \mathrm{St}_1(A)$ there is
a unique atomic effect $a \in \mathrm{Eff}(A)$ such that
$(a|\varphi) = 1$.

Proof. Existence has been already proved in lemma 28. Let us prove
uniqueness: suppose that a and a' are two atomic effects such that
$(a|\varphi) = (a'|\varphi) = 1$. Then, applying lemma 30 to a and a'
we obtain

$$(a|_A |\Phi)_{AB} = p_{\max}\,|\psi)_B = (a'|_A |\Phi)_{AB}.$$

Since $\Phi$ is dynamically faithful, this implies a = a'. ■

Finally, an important consequence of theorem 8 is

Corollary 13 If $a, a' \in \mathrm{Eff}(A)$ are two atomic effects
with $\|a\| = \|a'\| = 1$, then there is a reversible channel
$\mathscr{U} \in \mathbf{G}_A$ such that $(a'|_A = (a|_A \mathscr{U}$.

Proof. Let $\varphi$ and $\varphi'$ be the (unique) normalized states
such that $(a|\varphi) = 1$ and $(a'|\varphi') = 1$, respectively.
Now, there is a reversible channel $\mathscr{U} \in \mathbf{G}_A$ such
that $|\varphi)_A = \mathscr{U}|\varphi')_A$. Hence,
$(a'|\varphi') = (a|\varphi) = (a|\mathscr{U}|\varphi')$. By theorem
8, one has $(a'|_A = (a|_A \mathscr{U}$. ■

We conclude this section with an elementary result 
that will be used later in the paper: 



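Lemma 31 Let $\mathscr{E} \in \mathrm{Transf}(A, C)$ and
$\mathscr{D} \in \mathrm{Transf}(C, A)$ be the encoding and the
decoding in the ideal compression scheme for
$\rho \in \mathrm{St}_1(A)$. If $|\varphi) \in F_\rho$ is a pure state
and $(a| \in \mathrm{Eff}(A)$ is the atomic effect such that
$(a|\varphi) = 1$, then
$|\gamma) := \mathscr{E}|\varphi) \in \mathrm{St}_1(C)$ is a pure
state and $(c| := (a|\mathscr{D} \in \mathrm{Eff}(C)$ is the atomic
effect such that $(c|\gamma) = 1$.

Proof. The state $|\gamma) := \mathscr{E}|\varphi)$ is pure by lemma
||. The effect $(c| := (a|\mathscr{D}$ is atomic by lemmas 14 and 16.
Since $\mathscr{D}\mathscr{E} =_\rho \mathcal{I}_A$, one has
$(c|\gamma) = (a|\mathscr{D}\mathscr{E}|\varphi) = (a|\varphi) = 1$. ■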
VII. DIMENSION

In this section we show that each system in our theory has a given
informational dimension, defined as the maximum number of perfectly
distinguishable pure states available in the system. In the Hilbert
space framework, the informational dimension will be the dimension of
the Hilbert space.

Lemma 32 All maximal sets of perfectly distinguishable 
pure states have the same number of elements. 

Proof. Let $\{\varphi_i\}_{i=1}^N$ be a maximal set of perfectly
distinguishable pure states for system A, and let $\{a_i\}_{i=1}^N$ be
the observation-test such that $(a_i|\varphi_j) = \delta_{ij}$. By
lemma 27, each $a_i$ is atomic and $\|a_i\| = 1$. Then, by corollary
13, one has $(a_i|_A = (a_0|\mathscr{U}_i$ where each $\mathscr{U}_i$
is a reversible channel and $a_0$ is a fixed atomic effect with
$\|a_0\| = 1$. By the invariance of $\chi$ we then obtain
$(a_i|\chi_A) = (a_0|\mathscr{U}_i|\chi_A) = (a_0|\chi_A)$. On the
other hand, one has $\sum_{i=1}^N (a_i|\chi_A) = 1$, which implies
$N = 1/(a_0|\chi_A)$. Since $a_0$ is arbitrary, N is independent of
the choice of the set $\{\varphi_i\}_{i=1}^N$. ■

As a consequence, the number of perfectly distinguishable pure states
in a maximal set is a property of the system A. We will call this
number the informational dimension (or simply the dimension) of system
A, and denote it with $d_A$. The informational dimension $d_A$ should
not be confused with the size $D_A$ of the state space St(A): recall
that $D_A$ was defined as the dimension of the real vector space
$\mathrm{St}_{\mathbb{R}}(A)$. In quantum theory one has
$D_A = d_A^2$.
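The quantum relation between the size of the state space and the informational dimension can be checked by a simple parameter count. The sketch below assumes the standard quantum identification of the real span of the states with the Hermitian d x d matrices; the function name is a hypothetical helper introduced for this illustration.

```python
def hermitian_real_dimension(d):
    """Number of real parameters of a d x d Hermitian matrix."""
    diagonal = d                           # real diagonal entries
    off_diagonal_pairs = d * (d - 1) // 2  # entries above the diagonal
    # Each entry above the diagonal contributes a real and an imaginary part;
    # the entries below the diagonal are fixed by Hermiticity.
    return diagonal + 2 * off_diagonal_pairs

for d in range(1, 6):
    print(d, hermitian_real_dimension(d), d * d)
```

For every d the two printed counts coincide, in agreement with the square law quoted for quantum theory.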

An immediate consequence of the proof of lemma 32 is

Corollary 14 For every atomic effect a with $\|a\| = 1$ one has
$(a|\chi_A) = 1/d_A$.

This simple fact has two very important consequences. 
The first is that the dimension of a composite system is 
the product of the dimensions of the components: 

Corollary 15 The dimension of the composite system AB is the product
of the dimensions of A and B, namely $d_{AB} = d_A d_B$.

Proof. From lemma 10 we know that $\chi_A \otimes \chi_B$ is the
unique invariant state of system AB. Now, if $a \in \mathrm{Eff}(A)$
and $b \in \mathrm{Eff}(B)$ are atomic effects such that
$\|a\| = \|b\| = 1$, then $a \otimes b$ is an atomic effect such that
$\|a \otimes b\| = 1$. Hence, by corollary 14, we have
$1/d_{AB} = (a \otimes b|\chi_A \otimes \chi_B) = (a|\chi_A)(b|\chi_B) = 1/(d_A d_B)$. ■
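The product rule can be checked numerically in the quantum case. The sketch below is an illustration under assumptions not made in the text: normalized atomic effects are modelled as rank-one projectors, the invariant state as the maximally mixed state I/d, and everything is restricted to diagonal matrices for brevity. All helper names are hypothetical.

```python
from fractions import Fraction

def invariant_state(d):
    """Diagonal of the maximally mixed state I/d."""
    return [Fraction(1, d)] * d

def rank_one_projector(index, d):
    """Diagonal of the projector |index><index|."""
    return [Fraction(int(k == index)) for k in range(d)]

def kron_diagonal(x, y):
    """Diagonal of the tensor product of two diagonal matrices."""
    return [a * b for a in x for b in y]

def pairing(effect, state):
    """Probability Tr[effect * state] for diagonal matrices."""
    return sum(e * s for e, s in zip(effect, state))

d_a, d_b = 2, 3
a = rank_one_projector(0, d_a)
b = rank_one_projector(1, d_b)
chi_a, chi_b = invariant_state(d_a), invariant_state(d_b)

lhs = pairing(kron_diagonal(a, b), kron_diagonal(chi_a, chi_b))
rhs = pairing(a, chi_a) * pairing(b, chi_b)
print(lhs, rhs)  # both equal 1/6
```

Both sides evaluate to 1/(d_A d_B), which is the chain of equalities used in the proof of Corollary 15.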

The second consequence is the relation between the dimension and the
maximum probability of a pure state in the convex decomposition of the
invariant state $|\chi)_A$:

Lemma 33 For every system A, the maximum probability of a pure state
in the convex decomposition of the invariant state is
$p_{\max} = 1/d_A$.

Proof. Let $\Phi \in \mathrm{St}_1(AB)$ be a purification of the
invariant state $|\chi_A)$, chosen in such a way that the marginal on
system B is completely mixed. Let $a \in \mathrm{Eff}(A)$ be an atomic
effect with $\|a\| = 1$. Then, equation (13) becomes

$$(a|_A |\Phi)_{AB} = p_{\max}\,|\psi)_B,$$

where $\psi$ is some normalized pure state of system B. Applying the
deterministic effect e on system B on both sides we obtain
$(a|\chi_A) = p_{\max}$. Finally, corollary 14 states
$(a|\chi_A) = 1/d_A$. By comparison, we obtain $p_{\max} = 1/d_A$. ■

Thanks to the compression axiom 3, the notion of dimension can be
applied not only to the whole state space $\mathrm{St}_1(A)$ but also
to its faces. With face F of the convex set $\mathrm{St}_1(A)$ we
always mean the face $F_\rho$ identified by some state
$\rho \in \mathrm{St}_1(A)$.

Lemma 34 Let F be a face of the convex set $\mathrm{St}_1(A)$. Every
maximal set $\{\varphi_i\}_{i=1}^k$ of perfectly distinguishable pure
states in F has the same cardinality k. Precisely, if F is the face
identified by $\rho \in \mathrm{St}_1(A)$ and
$\mathscr{E} \in \mathrm{Transf}(A, C)$ is the encoding in the ideal
compression for $\rho$, then we have $k = d_C$.

Proof. The set
$\{\mathscr{E}\varphi_i\}_{i=1}^k \subset \mathrm{St}_1(C)$ is
perfectly distinguishable by lemma 25 and it is maximal by corollary
12. Moreover, the states $\{\mathscr{E}\varphi_i\}_{i=1}^k$ are pure
by lemma ||. Hence, the cardinality k of the set
$\{\varphi_i\}_{i=1}^k$ must be $k = d_C$. ■

From now on the maximum number of perfectly distin- 
guishable states in the face F will be called the dimension 
of the face F and will be denoted by \F\. 



VIII. DECOMPOSITION INTO PERFECTLY 
DISTINGUISHABLE PURE STATES 

In this section we show that in a theory satisfying our 
principles any state can be written as a convex combina- 
tion of perfectly distinguishable pure states. In quantum 
theory, this corresponds to the diagonalization of the den- 
sity matrix. 
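For orientation, the quantum counterpart of this statement can be made concrete for a single qubit. The sketch below is an illustration only: it assumes the standard Bloch-vector parametrization of qubit density matrices, reads "perfectly distinguishable pure states" as orthogonal rank-one projectors, and uses hypothetical helper names.

```python
import math

def density_from_bloch(r):
    """2x2 density matrix (I + r.sigma)/2 for a Bloch vector r with |r| <= 1."""
    x, y, z = r
    return [[(1 + z) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, (1 - z) / 2]]

def spectral_decomposition(r):
    """Return [(probability, pure_state_matrix), ...] for the qubit state with Bloch vector r."""
    norm = math.sqrt(sum(c * c for c in r))
    if norm == 0:
        unit = (0.0, 0.0, 1.0)  # maximally mixed state: any orthonormal basis works
    else:
        unit = tuple(c / norm for c in r)
    plus = density_from_bloch(unit)
    minus = density_from_bloch(tuple(-c for c in unit))
    return [((1 + norm) / 2, plus), ((1 - norm) / 2, minus)]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

r = (0.3, -0.4, 0.5)
rho = density_from_bloch(r)
terms = spectral_decomposition(r)

# Recombine the convex mixture and compute the overlap of the two pure states.
recombined = [[sum(p * m[i][j] for p, m in terms) for j in range(2)]
              for i in range(2)]
overlap = trace(matmul(terms[0][1], terms[1][1]))
```

Up to floating-point error, `recombined` reproduces `rho` and `overlap` vanishes, so the mixed state is a convex combination of two orthogonal, hence perfectly distinguishable, pure states.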

To prove this result we need first a sufficient condition for the
distinguishability of states, given in the following

Lemma 35 Let $\{\rho_i\}_{i=1}^N \subset \mathrm{St}_1(A)$ be a set of
states. If there exists a set of effects
$\{b_i\}_{i=1}^N \subset \mathrm{Eff}(A)$ (not necessarily an
observation-test) such that $(b_i|\rho_j) = \delta_{ij}$, then the
states $\{\rho_i\}_{i=1}^N$ are perfectly distinguishable.

Proof. For each $i = 1, \dots, N$ consider the binary test
$\{b_i, e - b_i\}$. Since by hypothesis
$(b_i|\rho_j) = \delta_{ij}$, the test $\{b_i, e - b_i\}$ can
perfectly distinguish $\rho_i$ from any mixture of the states
$\{\rho_j\}_{j \neq i}$. In particular, this means that, for every
$M < N$, $\rho_{M+1}$ can be perfectly distinguished from the mixture
$\omega_M = \sum_{j=1}^M \rho_j/M$. Note that, by definition, the
states $\{\rho_i\}_{i=1}^M$ belong to the face $F_{\omega_M}$. We now
prove by induction on M that the states $\{\rho_i\}_{i=1}^M$ are
perfectly distinguishable. This is true for M = 1. Now, suppose that
the states $\{\rho_i\}_{i=1}^M$ are perfectly distinguishable. Since
the state $\rho_{M+1}$ is perfectly distinguishable from $\omega_M$,
by lemma 23 we have that the states $\{\rho_i\}_{i=1}^{M+1}$ are
perfectly distinguishable. Taking M = N - 1 the thesis follows. ■

We now show that the invariant state $\chi$ is a mixture of perfectly
distinguishable pure states.

Theorem 9 For every maximal set of perfectly distinguishable pure
states $\{\varphi_i\}_{i=1}^{d_A} \subset \mathrm{St}_1(A)$ one has

$$\chi_A = \frac{1}{d_A} \sum_{i=1}^{d_A} \varphi_i.$$

Proof. Let $\{a_i\}_{i=1}^{d_A}$ be the observation test such that
$(a_i|\varphi_j) = \delta_{ij}$, and let
$\Phi \in \mathrm{St}_1(AB)$ be a purification of $\chi_A$, chosen in
such a way that the marginal on system B is completely mixed (theorem
4). Let $\{\psi_i\}_{i=1}^{d_A} \subset \mathrm{St}_1(B)$ be the pure
states defined by

$$(a_i|_A |\Phi)_{AB} = \frac{1}{d_A}\,|\psi_i)_B$$

and, for each i, let $b_i$ be the atomic effect such that

$$(b_i|_B |\Phi)_{AB} = \frac{1}{d_A}\,|\varphi_i)_A \qquad (16)$$

(here we used lemma 30 and the fact that $p_{\max} = 1/d_A$). Then we
have

$$(b_i|\psi_j) = d_A\,\bigl[(a_j|_A \otimes (b_i|_B\bigr]\,|\Phi)_{AB} = (a_j|\varphi_i) = \delta_{ij}. \qquad (17)$$

By lemma 35, this implies that the states $\{\psi_i\}_{i=1}^{d_A}$
are perfectly distinguishable. Now, since the marginal of
$|\Phi)_{AB}$ on system B is completely mixed, theorem 6 states that
the set $\{\psi_i\}_{i=1}^{d_A}$ is maximal. Let
$\{b'_i\}_{i=1}^{d_A}$ be the observation test such that
$(b'_i|\psi_j) = \delta_{ij}$. By lemma 27, each $b'_i$ must be
atomic. On the other hand, there is a unique atomic effect $b_i$ such
that $(b_i|\psi_i) = 1$ (theorem 8). Therefore, $b'_i = b_i$. This
means that the effects $\{b_i\}_{i=1}^{d_A}$ form an observation
test. Once this fact has been proved, using Eq. (16) we obtain

$$|\chi)_A = (e|_B |\Phi)_{AB} = \sum_i (b_i|_B |\Phi)_{AB} = \frac{1}{d_A} \sum_i |\varphi_i)_A.$$

■
As a consequence, we have the following

Corollary 16 (Existence of conjugate systems) For every system A
there exists a system $\bar{A}$, called the conjugate system, and a
purification $\Phi \in \mathrm{St}_1(A\bar{A})$ of the invariant state
$\chi_A$ such that $d_{\bar{A}} = d_A$ and the marginal on $\bar{A}$
is the invariant state $\chi_{\bar{A}}$. The conjugate system
$\bar{A}$ is unique up to operational equivalence.

Proof. We first prove that $\bar{A}$ is unique up to operational
equivalence. The defining property of the conjugate system $\bar{A}$
is that the marginal of $\Phi$ on $\bar{A}$ is the invariant state
$\chi_{\bar{A}}$, which is completely mixed. Theorem 4 then implies
that $\bar{A}$ is unique up to operational equivalence. Let us now
show the existence of $\bar{A}$. Take a purification of $\chi_A$, with
purifying system $\bar{A}$ chosen so that the marginal of $\Phi$ on
$\bar{A}$ is completely mixed (this is possible thanks to theorem 4).
Now, the states
$\{\psi_i\}_{i=1}^{d_A} \subseteq \mathrm{St}(\bar{A})$, defined by
$|\psi_i) := d_A\,[(a_i| \otimes \mathcal{I}_{\bar{A}}]\,|\Phi)$, are
perfectly distinguishable [see Eq. (17) in the proof of theorem 9].
Hence, by theorem 6 they are a maximal set of perfectly
distinguishable pure states. This implies $d_{\bar{A}} = d_A$.
Finally, by theorem 9 one has
$\frac{1}{d_{\bar{A}}} \sum_{i=1}^{d_{\bar{A}}} \psi_i = \chi_{\bar{A}}$. ■

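Corollary 17 The distance between the invariant state $\chi_{\mathrm A}$ and an arbitrary pure state $\varphi \in \mathrm{St}_1(\mathrm A)$ is

$$\|\chi - \varphi\| = \frac{2(d_{\mathrm A} - 1)}{d_{\mathrm A}}.$$

Proof. Take a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$ such that $\varphi_1 = \varphi$ (corollary 11). Since $\chi = \sum_{i=1}^{d_{\mathrm A}} \varphi_i / d_{\mathrm A}$ one has

$$\chi - \varphi = \frac{d_{\mathrm A} - 1}{d_{\mathrm A}}\,(\sigma - \varphi_1),$$

where $\sigma = \sum_{i=2}^{d_{\mathrm A}} \varphi_i/(d_{\mathrm A} - 1)$. Hence, one has $\|\chi - \varphi\| = \frac{d_{\mathrm A} - 1}{d_{\mathrm A}}\,\|\sigma - \varphi_1\| = \frac{2(d_{\mathrm A} - 1)}{d_{\mathrm A}}$, having used that $\sigma$ and $\varphi_1$ are perfectly distinguishable and therefore $\|\sigma - \varphi_1\| = 2$ (see subsection II-I in Ref. [21]). ■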
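The value $2(d_{\mathrm A}-1)/d_{\mathrm A}$ can be cross-checked in the quantum case, assuming that the operational norm reduces to the trace norm and that $\chi = I/d$. Since both $\chi$ and a basis projector are diagonal in the same basis, the trace norm is a sum of absolute values of diagonal entries. The function name `distance_to_pure` is a hypothetical helper.

```python
from fractions import Fraction as F

def distance_to_pure(d):
    """Trace norm of chi - |0><0| with chi = I/d; both operators are diagonal."""
    chi_diag = [F(1, d)] * d
    pure_diag = [F(1)] + [F(0)] * (d - 1)
    return sum(abs(c - p) for c, p in zip(chi_diag, pure_diag))

# Compare with the formula of the corollary for a few dimensions.
distances = {d: distance_to_pure(d) for d in range(1, 8)}
```

For $d = 2$ this gives the familiar value 1, the distance between the center of the Bloch ball and a point on its surface in this normalization.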
We can now prove the following strong result: 

Theorem 10 (Spectral decomposition) For every 
system A, every mixed state can be written as a convex 
combination of perfectly distinguishable pure states. 



Proof. The proof is by induction on the dimension of the system. If $d_{\mathrm A} = 1$, the thesis trivially holds. Now suppose that the thesis holds for any system B with dimension $d_{\mathrm B} \le N$, and take a mixed state $\rho \in \mathrm{St}_1(\mathrm A)$ where $d_{\mathrm A} = N + 1$. There are two possibilities: either (1) $\rho$ is not completely mixed or (2) $\rho$ is completely mixed.

Suppose that (1) $\rho$ is not completely mixed. Then by the compression axiom |^ one can encode it in a system C, using an encoding operation $\mathcal E \in \mathrm{Transf}(\mathrm A, \mathrm C)$. Now, the maximum number of perfectly distinguishable states in C is equal to the maximum number of perfectly distinguishable states in the face $F_\rho$ (corollary 12). Since $\rho$ is not completely mixed, we must have $d_{\mathrm C} \le N$. Using the induction hypothesis we then obtain that the state $\mathcal E\rho \in \mathrm{St}_1(\mathrm C)$ is a mixture of perfectly distinguishable pure states, say $\mathcal E\rho = \sum_i p_i \psi_i$. Applying the decoding operation $\mathcal D \in \mathrm{Transf}(\mathrm C, \mathrm A)$ we get $\rho = \mathcal D\mathcal E\rho = \sum_i p_i\, \mathcal D\psi_i$. Since by lemmas || and 25 we know that the states $\{\mathcal D\psi_i\}$ are pure and perfectly distinguishable, this is the desired decomposition for $\rho$.

Now suppose that (2) $\rho$ is completely mixed. Consider the half-line in $\mathrm{St}_{\mathbb R}(\mathrm A)$ defined by $\sigma_t = (1 + t)\rho - t\chi$, $t \ge 0$. Since the set of normalized states $\mathrm{St}_1(\mathrm A)$ is compact, the line will cross its border at some point $t_0$. Therefore, one will have

$$\rho = \frac{1}{1 + t_0}\,\sigma_{t_0} + \frac{t_0}{1 + t_0}\,\chi$$

for some state $\sigma_{t_0}$ on the border of $\mathrm{St}_1(\mathrm A)$, that is, for some state that is not completely mixed. But we know from the discussion of point (1) that the state $\sigma_{t_0}$ is a mixture of perfectly distinguishable pure states, say $\sigma_{t_0} = \sum_{i=1}^{k} p_i \varphi_i$. By lemma ^J, this set can be extended to a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$. On the other hand, theorem 9 states that $\chi = \sum_{i=1}^{d_{\mathrm A}} \varphi_i/d_{\mathrm A}$. This implies the desired decomposition

$$\rho = \sum_{i=1}^{d_{\mathrm A}} \left( \frac{q_i}{1 + t_0} + \frac{t_0}{d_{\mathrm A}(1 + t_0)} \right) \varphi_i ,$$

where $q_i = p_i$ for $1 \le i \le k$, and $q_i = 0$ otherwise. ■
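For orientation, the statement of theorem 10 has a familiar quantum instance for a qubit: a state with Bloch vector $\vec n$ is a mixture of the two antipodal pure states along $\pm\vec n/|\vec n|$ with weights $(1 \pm |\vec n|)/2$. The sketch below assumes this Bloch-ball picture (which the paper derives only later, in Section X); `spectral_decomposition` and `mix` are hypothetical helper names.

```python
import math

def spectral_decomposition(n):
    """Return [(p_plus, m), (p_minus, -m)] for the qubit state with Bloch vector n.

    m is a unit vector and the weights are (1 + |n|)/2 and (1 - |n|)/2.
    """
    r = math.sqrt(sum(x * x for x in n))
    if r == 0:
        m = (0.0, 0.0, 1.0)  # any axis works for the maximally mixed state
    else:
        m = tuple(x / r for x in n)
    return [((1 + r) / 2, m), ((1 - r) / 2, tuple(-x for x in m))]

def mix(decomposition):
    """Bloch vector of the mixture sum_i p_i * (pure state with Bloch vector m_i)."""
    return tuple(sum(p * m[k] for p, m in decomposition) for k in range(3))

n = (0.3, -0.2, 0.4)
decomposition = spectral_decomposition(n)
rebuilt = mix(decomposition)
```

The two pure states in the decomposition are antipodal on the sphere, which is the qubit form of perfect distinguishability.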

It is easy to show that the marginals of a pure bipartite 
state have the same spectral decomposition: 

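Corollary 18 Let $\Psi \in \mathrm{St}_1(\mathrm{AB})$ be a pure state, and let $\rho$ and $\tilde\rho$ be the marginals of $\Psi$ on systems A and B, respectively. If $\rho$ has spectral decomposition $\rho = \sum_{i=1}^{r} p_i \varphi_i$, with $p_i > 0$ for every $i = 1, \dots, r$, $r \le d_{\mathrm A}$, then $\tilde\rho$ has spectral decomposition $\tilde\rho = \sum_{i=1}^{r} p_i \psi_i$.

Proof. Let $\{a_i\}_{i=1}^{d_{\mathrm A}}$ be the observation-test such that $(a_i|\varphi_j) = \delta_{ij}$, and let $\{b_i\}_{i=1}^{r}$ be the observation test such that $(b_i|_{\mathrm B}\,|\Psi)_{\mathrm{AB}} = p_i\,|\varphi_i)_{\mathrm A}$ for every $i \le r$. For $i \le r$, define the pure state $\psi_i \in \mathrm{St}_1(\mathrm B)$ and the probability $q_i$ via the relation

$$(a_i|_{\mathrm A}\,|\Psi)_{\mathrm{AB}} = q_i\,|\psi_i)_{\mathrm B}$$

[Note that $\psi_i$ is pure due to the pure conditioning axiom]. By definition, we have

$$q_i\,(b_j|\psi_i) = (a_i \otimes b_j|\Psi) = p_j\,(a_i|\varphi_j) = p_i\,\delta_{ij} \qquad \forall i \le r,\ \forall j \le r.$$

The above relation implies $q_i = \sum_{j=1}^{r} q_i\,(b_j|\psi_i) = \sum_j p_i\,\delta_{ij} = p_i$ and $(b_j|\psi_i) = \delta_{ij}$. Hence, the states $\{\psi_i\}_{i=1}^{r}$ are perfectly distinguishable. On the other hand, we have $(a_i \otimes e_{\mathrm B}|\Psi) = (a_i|\rho) = 0$, $\forall i > r$, which implies $(a_i|_{\mathrm A}\,|\Psi)_{\mathrm{AB}} = 0$, $\forall i > r$. Therefore, we obtained

$$|\tilde\rho)_{\mathrm B} = (e|_{\mathrm A}\,|\Psi)_{\mathrm{AB}} = \sum_{i=1}^{d_{\mathrm A}} (a_i|_{\mathrm A}\,|\Psi)_{\mathrm{AB}} = \sum_{i=1}^{r} p_i\,|\psi_i)_{\mathrm B},$$

which is the desired spectral decomposition. ■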
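In quantum theory corollary 18 is the statement that the two reduced density matrices of a pure bipartite state have the same nonzero eigenvalues. A minimal two-qubit sketch follows, assuming the usual Hilbert-space representation; the amplitudes in `raw` are arbitrary illustrative numbers and `marginals`, `eigenvalues` are hypothetical helper names.

```python
import math

def marginals(psi):
    """psi[i][j] is the amplitude of |i>_A |j>_B; returns (rho_A, rho_B) as 2x2 matrices."""
    rho_a = [[sum(psi[i][k] * psi[j][k].conjugate() for k in range(2))
              for j in range(2)] for i in range(2)]
    rho_b = [[sum(psi[k][i] * psi[k][j].conjugate() for k in range(2))
              for j in range(2)] for i in range(2)]
    return rho_a, rho_b

def eigenvalues(m):
    """Sorted eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    t = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(t * t - 4 * det, 0.0))
    return sorted(((t - disc) / 2, (t + disc) / 2))

# Arbitrary (illustrative) amplitudes, normalized to a unit vector.
raw = [[0.6 + 0j, 0.2j], [0.3 + 0j, 0.5 - 0.1j]]
norm = math.sqrt(sum(abs(x) ** 2 for row in raw for x in row))
psi = [[x / norm for x in row] for row in raw]

rho_a, rho_b = marginals(psi)
spectrum_a = eigenvalues(rho_a)
spectrum_b = eigenvalues(rho_b)
```

The two spectra agree up to rounding, as expected from the Schmidt decomposition.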

The spectral decomposition of states has many conse- 
quences. Here we just discuss the simplest ones, which 
are needed for the purpose of the derivation of quantum 
theory. 

A first consequence is the following lemma: 

Lemma 36 Let $\varphi \in \mathrm{St}_1(\mathrm A)$ be a pure state and let $a \in \mathrm{Eff}(\mathrm A)$ be the unique atomic effect such that $(a|\varphi) = 1$. If $\varphi$ is perfectly distinguishable from $\rho$, then $(a|\rho) = 0$.

Proof. Let us write $\rho = \sum_{i=1}^{k} p_i \varphi_i$, with $\{\varphi_i\}_{i=1}^{k}$ perfectly distinguishable pure states and $p_i > 0$ for each $i$. Now, by lemma [2^ the states $\{\varphi_1, \dots, \varphi_k, \varphi\}$ are perfectly distinguishable, and by lemma ^4] this set can be extended to a maximal set of perfectly distinguishable pure states $\{\gamma_m\}_{m=1}^{d_{\mathrm A}}$, with $\gamma_i = \varphi_i$ for $i \le k$ and $\gamma_{k+1} = \varphi$. Denote by $\{c_m\}_{m=1}^{d_{\mathrm A}}$ the observation test that perfectly distinguishes between the states $\{\gamma_m\}$. Note that, by definition, $(c_{k+1}|\varphi) = 1$ and $(c_{k+1}|\varphi_j) = 0$ for every $j \ne k + 1$. Also, recall that $c_{k+1}$ is atomic (lemma 27). By the duality of theorem ^| we have $a = c_{k+1}$, and, therefore, $(a|\rho) = \sum_{i=1}^{k} p_i\,(c_{k+1}|\varphi_i) = 0$. ■

Another consequence of theorem 10 is the following characterization of the completely mixed states as full rank states:

Corollary 19 (Characterization of completely mixed states) A state $\rho \in \mathrm{St}_1(\mathrm A)$, written as a mixture $\rho = \sum_{i=1}^{d_{\mathrm A}} p_i \varphi_i$ of a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$, is completely mixed if and only if $p_i > 0$ for every $i = 1, \dots, d_{\mathrm A}$.

Proof. Necessity: If $p_i = 0$ for some $i$, then $\rho$ is perfectly distinguishable from $\varphi_i$. Hence, it cannot be completely mixed. Sufficiency: let $p_{\min} = \min\{p_i,\ i = 1, \dots, d_{\mathrm A}\}$. Then $\rho = p_{\min}\,\chi + (1 - p_{\min})\,\sigma$, where $\sigma$ is the state defined by $\sigma = \frac{1}{1 - p_{\min}} \sum_{i=1}^{d_{\mathrm A}} (p_i - p_{\min}/d_{\mathrm A})\,\varphi_i$. Since $\rho$ contains $\chi$ in its convex decomposition, and since $\chi$ is completely mixed, we conclude that $\rho$ is completely mixed. ■
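The sufficiency step can be checked numerically on the level of probability vectors. The sketch below represents a state diagonal in a fixed maximal set by its list of weights $p_i$ (an assumption made only for illustration), with $\chi$ corresponding to the uniform vector; `split_off_invariant` is a hypothetical helper name.

```python
from fractions import Fraction as F

def split_off_invariant(p):
    """Return (p_min, sigma) with p = p_min * chi + (1 - p_min) * sigma.

    p is a probability vector of length d >= 2 with all entries > 0,
    chi is the uniform vector, and sigma is again a probability vector.
    """
    d = len(p)
    p_min = min(p)
    sigma = [(pi - p_min / d) / (1 - p_min) for pi in p]
    return p_min, sigma

p = [F(1, 2), F(1, 3), F(1, 6)]
p_min, sigma = split_off_invariant(p)
d = len(p)
rebuilt = [p_min * F(1, d) + (1 - p_min) * s for s in sigma]
```

Every entry of `sigma` is nonnegative because $p_i \ge p_{\min} \ge p_{\min}/d$, so `sigma` is a legitimate probability vector.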

In particular, for two-dimensional systems we have the 
result: 

Corollary 20 For $d_{\mathrm A} = 2$ any state on the border of $\mathrm{St}_1(\mathrm A)$ is pure.

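Another consequence of theorem 10 is that every element in the vector space $\mathrm{St}_{\mathbb R}(\mathrm A)$ can be written as a linear combination of perfectly distinguishable states:

Corollary 21 For every $\xi \in \mathrm{St}_{\mathbb R}(\mathrm A)$ there exists a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$ and a set of real numbers $\{c_i\}_{i=1}^{d_{\mathrm A}}$ such that $|\xi) = \sum_i c_i\,|\varphi_i)$.

Proof. Write $\xi$ as $\xi = c_+\rho - c_-\sigma$, where $c_+, c_- \ge 0$ and $\rho$ and $\sigma$ are normalized states. If $c_- = 0$ there is nothing to prove, because $\xi$ is proportional to a state. Then, suppose that $c_- > 0$. Write $\sigma$ as $\sigma = \sum_i p_i \psi_i$ where $\{\psi_i\}$ are perfectly distinguishable and define $k = \max\{p_i\}$. Then one has $\chi + \frac{1}{c_- k d_{\mathrm A}}\,\xi = \left(\chi - \frac{1}{k d_{\mathrm A}}\,\sigma\right) + \frac{c_+}{c_- k d_{\mathrm A}}\,\rho$. Now, by definition $\chi - \frac{1}{k d_{\mathrm A}}\,\sigma$ is proportional to a state: indeed we have $\chi - \frac{1}{k d_{\mathrm A}}\,\sigma = \frac{1}{d_{\mathrm A}} \sum_i (1 - p_i/k)\,\psi_i$, and, by definition, $1 - p_i/k \ge 0$. Therefore $\chi + \frac{1}{c_- k d_{\mathrm A}}\,\xi$ is proportional to a state, say $\chi + \frac{1}{c_- k d_{\mathrm A}}\,\xi = t\tau$, with $t > 0$. Writing $\tau$ as $\tau = \sum_i q_i \varphi_i$, where $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$ is a maximal set of perfectly distinguishable pure states, we then obtain $\xi = (c_- k d_{\mathrm A})(t\tau - \chi) = (c_- k d_{\mathrm A}) \sum_i (t q_i - 1/d_{\mathrm A})\,\varphi_i$, which is the desired decomposition. ■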

In quantum theory, corollary 21 is equivalent to the fact that every Hermitian matrix is diagonal in a suitable orthonormal basis. A simple consequence of corollary 21 is the following

Corollary 22 For every system A with $d_{\mathrm A} = 2$ there is a continuous set of pure states.

Proof. Let $\xi \in \mathrm{St}_{\mathbb R}(\mathrm A)$ be an arbitrary vector such that $(e|\xi) = 0$. Note that since the convex set $\mathrm{St}_1(\mathrm A)$ cannot be a segment (corollary |?]), we must have $D_{\mathrm A} = \dim[\mathrm{St}_{\mathbb R}(\mathrm A)] > 2$ and, therefore, the space of vectors $\xi$ such that $(e|\xi) = 0$ is at least two-dimensional. By corollary 21, we have $\xi = c(\varphi_1 - \varphi_2) = 2c(\varphi_1 - \chi)$, where $c > 0$, $\{\varphi_1, \varphi_2\}$ are two perfectly distinguishable pure states and we used the fact that $\chi = \frac{1}{2}(\varphi_1 + \varphi_2)$. Let us define $\varphi_\xi := \varphi_1$. With this definition, if $\varphi_{\xi_1} = \varphi_{\xi_2}$ then one has $\xi_2 = t\xi_1$ for some $t > 0$. Now, since there is a continuous infinity of vectors $\xi$ (up to scaling), there must be a continuous set of pure states. ■

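We conclude this section with the dual result to the "spectral decomposition" of corollary 21:

Corollary 23 For every $x \in \mathrm{Eff}_{\mathbb R}(\mathrm A)$ there exists a perfectly distinguishing observation-test $\{a_i\}_{i=1}^{d_{\mathrm A}}$ and a set of real numbers $\{d_i\}_{i=1}^{d_{\mathrm A}}$ such that $(x| = \sum_i d_i\,(a_i|$.

Proof. Let $\Phi \in \mathrm{St}_1(\mathrm A\tilde{\mathrm A})$ be a purification of the invariant state $\chi_{\mathrm A}$, where $\tilde{\mathrm A}$ is the conjugate system defined in corollary 16. Take the Choi vector $|R_x)_{\tilde{\mathrm A}} := (x|_{\mathrm A}\,|\Phi)_{\mathrm A\tilde{\mathrm A}}$. By corollary 21 there exists a maximal set of perfectly distinguishable pure states $\{\psi_i\}_{i=1}^{d_{\mathrm A}}$ and a set of real numbers $\{c_i\}_{i=1}^{d_{\mathrm A}}$ such that $|R_x) = \sum_i c_i\,|\psi_i)$. Let $\{a_i\}_{i=1}^{d_{\mathrm A}} \subset \mathrm{Eff}(\mathrm A)$ be the observation-test such that $\frac{1}{d_{\mathrm A}}\,|\psi_i)_{\tilde{\mathrm A}} = (a_i|_{\mathrm A}\,|\Phi)_{\mathrm A\tilde{\mathrm A}}$ for every $i = 1, \dots, d_{\mathrm A}$ (recall that by corollary 16 the marginal of $\Phi$ on system $\tilde{\mathrm A}$ is the invariant state $\chi_{\tilde{\mathrm A}}$ and $d_{\tilde{\mathrm A}} = d_{\mathrm A}$). The test $\{a_i\}_{i=1}^{d_{\mathrm A}}$ is perfectly distinguishing: if $\{b_i\}_{i=1}^{d_{\mathrm A}}$ is the observation-test such that $(b_i|\psi_j) = \delta_{ij}$ and $\varphi_i \in \mathrm{St}_1(\mathrm A)$ is the state defined by $|\varphi_i)_{\mathrm A} := d_{\mathrm A}\,(b_i|_{\tilde{\mathrm A}}\,|\Phi)_{\mathrm A\tilde{\mathrm A}}$, then we have

$$(a_i|\varphi_j) = d_{\mathrm A}\,(a_i \otimes b_j|\Phi) = (b_j|\psi_i) = \delta_{ij}.$$

Moreover, we have

$$(x|_{\mathrm A}\,|\Phi)_{\mathrm A\tilde{\mathrm A}} = |R_x)_{\tilde{\mathrm A}} = \sum_i c_i\,|\psi_i)_{\tilde{\mathrm A}} = \sum_i c_i\,d_{\mathrm A}\,(a_i|_{\mathrm A}\,|\Phi)_{\mathrm A\tilde{\mathrm A}}.$$

Since $\Phi$ is dynamically faithful, this implies $(x| = \sum_i d_i\,(a_i|$, where $d_i := c_i d_{\mathrm A}$. ■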



IX. TELEPORTATION REVISITED 

In this section we revisit probabilistic teleportation using the results about informational dimension. The key point of the section will be the proof of the equality $D_{\mathrm A} = d_{\mathrm A}^2$, which relates the dimension of the vector space $\mathrm{St}_{\mathbb R}(\mathrm A)$ with the informational dimension $d_{\mathrm A}$.



A. Probability of teleportation 

We start by showing a probabilistic teleportation scheme that achieves success probability $p_{\mathrm A} = 1/d_{\mathrm A}^2$ for every system A:

Theorem 11 (Probability of teleportation) For every system A, probabilistic teleportation can be achieved with probability $p_{\mathrm A} = 1/d_{\mathrm A}^2$.



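Proof. Let $\tilde{\mathrm A}$ and $|\Phi)_{\mathrm A\tilde{\mathrm A}}$ be the conjugate system and the pure state defined in corollary 16. Then, the state $|\Phi)_{\mathrm A\tilde{\mathrm A}}\,|\Phi)_{\mathrm A\tilde{\mathrm A}}$ satisfies the identity

[Diagram: applying the deterministic effect $e$ to two of the four systems of $\Phi \otimes \Phi$ leaves the invariant state $\chi_{\mathrm A\tilde{\mathrm A}}$ on the remaining two.]

On the other hand, by lemma [53| the maximum probability of a pure state in the convex decomposition of $\chi_{\mathrm A\tilde{\mathrm A}}$ is $p_{\max} = 1/d_{\mathrm A\tilde{\mathrm A}}$, and by corollaries 15 and 16 one has $p_{\max} = 1/(d_{\mathrm A} d_{\tilde{\mathrm A}}) = 1/d_{\mathrm A}^2$. Therefore, by lemma O there exists an atomic effect $E$ such that

[Diagram: applying $E$ to two of the four systems of $\Phi \otimes \Phi$ leaves $\frac{1}{d_{\mathrm A}^2}\,|\Phi)$ on the remaining two.] (18)

and, since $\Phi$ is dynamically faithful,

[Diagram: applying $E$ to the input system and to one system of $\Phi$ gives $\frac{1}{d_{\mathrm A}^2}$ times the identity on the remaining system.] (19)

as can be verified applying both members of Eq. (19) to $\Phi$, thus obtaining Eq. (18). ■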
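The success probability $1/d_{\mathrm A}^2$ can be illustrated in the quantum case. The sketch below assumes, purely for illustration, that $\Phi$ is the maximally entangled vector $\sum_a |a\rangle|a\rangle/\sqrt d$ and that the teleportation effect is the projection onto the same vector; `teleport_branch` is a hypothetical helper name.

```python
import math

def teleport_branch(psi):
    """Project the input and the first half of Phi onto Phi.

    psi is a normalized state vector of dimension d. Returns (probability,
    unnormalized output vector) for the successful outcome.
    """
    d = len(psi)
    # Amplitudes of the assumed maximally entangled vector: phi[a][b] = delta_ab / sqrt(d).
    phi = [[(1 / math.sqrt(d)) if a == b else 0.0 for b in range(d)]
           for a in range(d)]
    out = [sum(phi[a][a2].conjugate() * psi[a] * phi[a2][b]
               for a in range(d) for a2 in range(d))
           for b in range(d)]
    prob = sum(abs(x) ** 2 for x in out)
    return prob, out

psi = [0.6, 0.8j, 0.0]  # arbitrary normalized qutrit state (illustrative)
prob, out = teleport_branch(psi)
recovered = [x / math.sqrt(prob) for x in out]
```

For this qutrit the successful branch occurs with probability $1/9$ and the renormalized output equals the input state.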



B. Isotropic states and effects 

Here we define two maps that send reversible transformations of A to reversible transformations of $\tilde{\mathrm A}$: the transpose and the conjugate. Using these maps we will also define the notions of isotropic states and effects and we will prove some properties of them.

Let us start from the definition of the transpose: 

Lemma 37 (Transpose of a reversible transformation) Let $\Phi \in \mathrm{St}(\mathrm A\tilde{\mathrm A})$ be a purification of the invariant state $\chi_{\mathrm A}$. The reversible transformations of system A are in one-to-one correspondence with the reversible transformations of system $\tilde{\mathrm A}$ via the transposition $\tau$ defined as follows

$$(\mathcal U \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi) = (\mathcal I_{\mathrm A} \otimes \mathcal U^{\tau})\,|\Phi) \tag{20}$$

[note that the transposition is defined with respect to the given state $\Phi$].

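Proof. Since $(\mathcal U \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi)$ and $|\Phi)$ are purifications of the same state $\chi_{\mathrm A}$, there exists a reversible transformation $\mathcal U^{\tau} \in \mathbf G_{\tilde{\mathrm A}}$ such that Eq. (20) holds. Since $\Phi$ is dynamically faithful on A, the map $\mathcal U \mapsto \mathcal U^{\tau}$ is injective. Furthermore, the map is surjective: for every reversible $\mathcal V \in \mathbf G_{\tilde{\mathrm A}}$ the states $(\mathcal I_{\mathrm A} \otimes \mathcal V)\,|\Phi)$ and $|\Phi)$ are two purifications of the same state $\chi_{\tilde{\mathrm A}}$, and, by the uniqueness of purification stated in postulate [j], there exists a reversible $\mathcal U \in \mathbf G_{\mathrm A}$ such that

$$(\mathcal U \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi) = (\mathcal I_{\mathrm A} \otimes \mathcal V)\,|\Phi), \tag{21}$$

namely $\mathcal V = \mathcal U^{\tau}$. ■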

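The conjugate is just defined as the inverse of the transpose:

Definition 7 Let $\tau$ be the transpose defined with respect to the state $\Phi \in \mathrm{St}_1(\mathrm A\tilde{\mathrm A})$. The conjugate of the reversible channel $\mathcal U \in \mathbf G_{\mathrm A}$ is the reversible channel $\mathcal U^* \in \mathbf G_{\tilde{\mathrm A}}$ defined by $\mathcal U^* := (\mathcal U^{\tau})^{-1}$.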

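We can now give the definition of isotropic pure state (isotropic atomic effect):

Definition 8 A pure state $\Psi \in \mathrm{St}(\mathrm A\tilde{\mathrm A})$ (an atomic effect $F \in \mathrm{Eff}(\mathrm A\tilde{\mathrm A})$) is isotropic if it is invariant under $\mathcal U \otimes \mathcal U^*$ for every $\mathcal U \in \mathbf G_{\mathrm A}$, namely

$$(\mathcal U \otimes \mathcal U^*)\,|\Psi) = |\Psi) \qquad \big(\,(F|\,(\mathcal U \otimes \mathcal U^*) = (F|\,\big) \qquad \forall\, \mathcal U \in \mathbf G_{\mathrm A}. \tag{22}$$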



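An example of isotropic state is $\Phi$: indeed, by definition of conjugate we have, for every $\mathcal U \in \mathbf G_{\mathrm A}$,

$$(\mathcal U \otimes \mathcal U^*)\,|\Phi) = (\mathcal U \otimes (\mathcal U^{\tau})^{-1})\,|\Phi) = (\mathcal I_{\mathrm A} \otimes (\mathcal U^{\tau})^{-1}\,\mathcal U^{\tau})\,|\Phi) = |\Phi).$$

As a consequence, the teleportation effect $E$ is isotropic: indeed one has

[Diagram]

which implies $(E|\,(\mathcal U \otimes \mathcal U^*) = (E|$, since the state $\Phi \otimes \Phi$ is dynamically faithful.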



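We now show that all isotropic pure states (isotropic atomic effects) are connected to the state $\Phi$ (to the effect $E$) through a local reversible transformation.

Lemma 38 If a pure state $\Psi \in \mathrm{St}_1(\mathrm A\tilde{\mathrm A})$ is isotropic then $|\Psi) = (\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi)$ for some reversible transformation $\mathcal V \in \mathbf G_{\mathrm A}$ such that $\mathcal V\mathcal U = \mathcal U\mathcal V$ for every $\mathcal U \in \mathbf G_{\mathrm A}$.

Proof. Since $\Psi$ satisfies Eq. (22), its marginal on system $\tilde{\mathrm A}$ is the invariant state $|\chi_{\tilde{\mathrm A}})$. Since $\Psi$ and $\Phi$ are purifications of the same state, there must exist a reversible channel $\mathcal V \in \mathbf G_{\mathrm A}$ such that $|\Psi) = (\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi)$. Moreover, we have for every $\mathcal U \in \mathbf G_{\mathrm A}$

$$(\mathcal U\mathcal V\mathcal U^{-1} \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi) = (\mathcal U\mathcal V \otimes \mathcal U^*)\,|\Phi) = (\mathcal U \otimes \mathcal U^*)\,|\Psi) = |\Psi).$$

Since $\Phi$ is dynamically faithful, the above equation implies $\mathcal U\mathcal V\mathcal U^{-1} = \mathcal V$ for every $\mathcal U \in \mathbf G_{\mathrm A}$. ■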

By the duality between states and effects, it is easy to 
obtain the following: 

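Lemma 39 Let $A \in \mathrm{Eff}(\mathrm A\tilde{\mathrm A})$ be the atomic effect such that $(A|\Phi) = 1$. If an atomic effect $F \in \mathrm{Eff}(\mathrm A\tilde{\mathrm A})$ is isotropic then $(F|_{\mathrm A\tilde{\mathrm A}} = (A|_{\mathrm A\tilde{\mathrm A}}\,(\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})$ for some reversible transformation $\mathcal V \in \mathbf G_{\mathrm A}$ such that $\mathcal V\mathcal U = \mathcal U\mathcal V$ for every $\mathcal U \in \mathbf G_{\mathrm A}$.

Proof. Let $\Psi$ be the pure state such that $(F|\Psi) = 1$. Clearly $\Psi$ is isotropic: one has $(F|\,(\mathcal U \otimes \mathcal U^*)\,|\Psi) = (F|\Psi) = 1$, and, therefore, $(\mathcal U \otimes \mathcal U^*)\,|\Psi) = |\Psi)$. By lemma 38, there exists a reversible transformation $\mathcal V$ such that $|\Psi) = (\mathcal V^{-1} \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi)$ and $\mathcal V^{-1}\mathcal U = \mathcal U\mathcal V^{-1}$ for every $\mathcal U \in \mathbf G_{\mathrm A}$. Now, this implies $(F|\,(\mathcal V^{-1} \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi) = (F|\Psi) = 1$, which by theorem | implies $(F| = (A|\,(\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})$. ■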

As a consequence, every isotropic effect is connected 
to the teleportation effect by a local reversible transfor- 
mation: 



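Corollary 24 If an atomic effect $F \in \mathrm{Eff}(\mathrm A\tilde{\mathrm A})$ is isotropic then $(F|_{\mathrm A\tilde{\mathrm A}} = (E|_{\mathrm A\tilde{\mathrm A}}\,(\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})$ for some reversible transformation $\mathcal V \in \mathbf G_{\mathrm A}$ such that $\mathcal V\mathcal U = \mathcal U\mathcal V$ for every $\mathcal U \in \mathbf G_{\mathrm A}$.

Proof. Since $(E|$ and $(F|$ are both isotropic, lemma 39 implies that they are both connected to $(A|$ through a local reversible transformation, say $\mathcal V$ and $\mathcal W$, respectively. Therefore, they are connected to each other through the transformation $\mathcal W\mathcal V^{-1}$. ■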



C. Dimension of the state space 

In this subsection we use the local distinguishability axiom to prove the equality $D_{\mathrm A} = d_{\mathrm A}^2$ (see theorem 12). As a consequence, we will be able to represent the states of a system A as square $d_{\mathrm A} \times d_{\mathrm A}$ Hermitian complex matrices, that is, Hermitian operators on the complex Hilbert space $\mathbb C^{d_{\mathrm A}}$. Theorem 12 is thus the point where the complex field (as opposed to the real field) enters in our derivation. Notice that, even if local distinguishability excludes quantum theory on real Hilbert spaces from the very beginning, to prove the emergence of complex Hilbert spaces we need to use all six principles.

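Due to local distinguishability, any bipartite state $\Psi \in \mathrm{St}(\mathrm{AB})$ can be written as

$$|\Psi) = \sum_{i=1}^{D_{\mathrm A}} \sum_{j=1}^{D_{\mathrm B}} \Psi_{ij}\,|\alpha_i)\,|\beta_j),$$

where $\{\alpha_i\}$ ($\{\beta_j\}$) is a basis for the vector space $\mathrm{St}_{\mathbb R}(\mathrm A)$ ($\mathrm{St}_{\mathbb R}(\mathrm B)$). Similarly, a bipartite effect $F \in \mathrm{Eff}(\mathrm{BA})$ can be written as

$$(F| = \sum_{k=1}^{D_{\mathrm B}} \sum_{l=1}^{D_{\mathrm A}} F_{kl}\,(\beta^*_k|\,(\alpha^*_l|,$$

with $(\alpha^*_l|\alpha_i) = \delta_{il}$ and $(\beta^*_k|\beta_j) = \delta_{jk}$. Finally, a transformation $\mathcal C$ from A to B can be written as

$$\mathcal C = \sum_{j=1}^{D_{\mathrm B}} \sum_{i=1}^{D_{\mathrm A}} C_{ji}\,|\beta_j)(\alpha^*_i|.$$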

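In this matrix representation, the teleportation diagram of Eq. (19) becomes

$$\Phi E = \frac{I_{D_{\mathrm A}}}{d_{\mathrm A}^2}, \tag{23}$$

where $I_{D_{\mathrm A}}$ is the identity matrix in dimension $D_{\mathrm A}$. On the other hand, we also have

$$1 \ge (E|\Phi) = \mathrm{Tr}[\Phi E] = \frac{D_{\mathrm A}}{d_{\mathrm A}^2}$$

and, therefore,

$$D_{\mathrm A} \le d_{\mathrm A}^2.$$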



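We now show that one has the equality, using the following standard lemma:

Lemma 40 With a suitable choice of basis for the vector space $\mathrm{St}_{\mathbb R}(\mathrm A)$, every reversible transformation $\mathcal U \in \mathbf G_{\mathrm A}$ is represented by a matrix of the form

$$M_{\mathcal U} = \begin{pmatrix} 1 & 0 \\ 0 & O_{\mathcal U} \end{pmatrix}, \tag{24}$$

where $O_{\mathcal U}$ is an orthogonal $(D_{\mathrm A} - 1) \times (D_{\mathrm A} - 1)$ matrix.

Proof. Let $\{\xi_i\}_{i=1}^{D_{\mathrm A}}$ be a basis for $\mathrm{St}_{\mathbb R}(\mathrm A)$, chosen in such a way that the first basis vector is $\chi$, while the remaining vectors satisfy $(e|\xi_i) = 0$, $\forall i = 2, \dots, D_{\mathrm A}$. Such a choice is always possible since every vector $v \in \mathrm{St}_{\mathbb R}(\mathrm A)$ can be written as $v = (e|v)\,\chi + \xi$, where $\xi$ satisfies $(e|\xi) = 0$. Now, since $\mathcal U\chi = \chi$, the first column of $M_{\mathcal U}$ must be $(1, 0, \dots, 0)^T$. Moreover, since for every normalized state $\rho$, $\mathcal U\rho$ is a normalized state, one must have $(e|\,\mathcal U\,|\xi) = 0$ for every $\xi$ such that $(e|\xi) = 0$. Hence, the first row of $M_{\mathcal U}$ must be $(1, 0, \dots, 0)$, namely $M_{\mathcal U}$ has the block form of Eq. (24). It remains to show that, with a suitable choice of basis, the matrix $O_{\mathcal U}$ in the second block can be chosen to be orthogonal. Observe that, by definition, the matrices $\{M_{\mathcal U}\}_{\mathcal U \in \mathbf G_{\mathrm A}}$ form a representation of the group $\mathbf G_{\mathrm A}$: indeed, one has $M_{\mathcal I} = I_{D_{\mathrm A}}$ and $M_{\mathcal U\mathcal V} = M_{\mathcal U} M_{\mathcal V}$ for every $\mathcal U, \mathcal V \in \mathbf G_{\mathrm A}$. Consider the positive definite matrix $P$ defined by the integral

$$P := \int \mathrm d\,\mathcal U\; O_{\mathcal U}^T O_{\mathcal U},$$

where $\mathrm d\,\mathcal U$ is the Haar measure on the compact group $\mathbf G_{\mathrm A}$ (see corollary 30 of Ref. [21] for the proof of compactness) and $A^T$ denotes the transpose of $A$. By definition, one has $P^T = P$ and $O_{\mathcal U}^T P\, O_{\mathcal U} = P$ for every $\mathcal U \in \mathbf G_{\mathrm A}$. Let us now define the new representation

$$O'_{\mathcal U} := P^{\frac12}\, O_{\mathcal U}\, P^{-\frac12},$$

obtained from $O_{\mathcal U}$ by a change of basis in the subspace spanned by $\{\xi_i\}_{i=2}^{D_{\mathrm A}}$. With this choice, each matrix $O'_{\mathcal U}$ is orthogonal:

$$O'^{\,T}_{\mathcal U}\, O'_{\mathcal U} = P^{-\frac12}\,\big(O_{\mathcal U}^T P\, O_{\mathcal U}\big)\,P^{-\frac12} = I_{D_{\mathrm A} - 1}. \qquad ■$$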
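The averaging argument can be tried out on a toy example in which the Haar integral is replaced by a finite sum. The sketch below is an assumption-laden illustration, not the construction used in the paper: the group is the cyclic group of order four, represented by the non-orthogonal matrices $O_k = S R^k S^{-1}$, and only the invariance $O^T P\,O = P$ is checked (the change of basis by $P^{1/2}$ is omitted). All names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

S = [[1, 1], [0, 1]]        # non-orthogonal change of basis
S_INV = [[1, -1], [0, 1]]   # its inverse
R = [[0, -1], [1, 0]]       # rotation by 90 degrees

# Representation O_k = S R^k S^{-1}, k = 0..3, of the cyclic group of order 4.
group = []
power = [[1, 0], [0, 1]]
for _ in range(4):
    group.append(matmul(matmul(S, power), S_INV))
    power = matmul(power, R)

# Finite-group analogue of P = integral dU O_U^T O_U.
P = [[0, 0], [0, 0]]
for o in group:
    oto = matmul(transpose(o), o)
    P = [[P[i][j] + oto[i][j] for j in range(2)] for i in range(2)]

invariant = all(matmul(matmul(transpose(o), P), o) == P for o in group)
```

Integer entries keep the check exact. The invariance holds because right multiplication by a group element only permutes the terms of the sum.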



As a consequence, we have the following: 

Corollary 25 For every system A, the group of reversible transformations $\mathbf G_{\mathrm A}$ is (isomorphic to) a compact subgroup of $O(D_{\mathrm A} - 1)$.

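Lemma 41 Let $E \in \mathrm{Eff}(\mathrm A\tilde{\mathrm A})$ be the teleportation effect of Eq. (18). Then, one has $(E|\Phi) = 1$.

Proof. Let $A \in \mathrm{Eff}(\mathrm A\tilde{\mathrm A})$ be the atomic effect such that $(A|\Phi) = 1$. We now prove that $A = E$. Indeed, by corollary 24 there exists a reversible transformation $\mathcal V \in \mathbf G_{\mathrm A}$ such that $(A| = (E|\,(\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})$. Using a basis for $\mathrm{St}_{\mathbb R}(\mathrm A)$ such that the transformations in $\mathbf G_{\mathrm A}$ are represented by orthogonal matrices as in Eq. (24), one has

$$1 = (A|\Phi) = (E|\,(\mathcal V \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi) = \mathrm{Tr}[E\,M_{\mathcal V}\,\Phi] = \mathrm{Tr}[\Phi E\,M_{\mathcal V}] = \frac{\mathrm{Tr}[M_{\mathcal V}]}{d_{\mathrm A}^2},$$

having used Eq. (23) for the last equality. Using the inequality $\mathrm{Tr}[M_{\mathcal V}] \le \mathrm{Tr}[I_{D_{\mathrm A}}]$, which holds for every orthogonal $D_{\mathrm A} \times D_{\mathrm A}$ matrix, we then obtain

$$1 = \frac{\mathrm{Tr}[M_{\mathcal V}]}{d_{\mathrm A}^2} \le \frac{\mathrm{Tr}[I_{D_{\mathrm A}}]}{d_{\mathrm A}^2} = \mathrm{Tr}[E\Phi] = (E|\Phi) \le 1,$$

and, therefore, $(E|\Phi) = 1$. ■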

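Theorem 12 (Dimension of the state space) The dimension $D_{\mathrm A}$ of the vector space generated by the states in $\mathrm{St}(\mathrm A)$ is $D_{\mathrm A} = d_{\mathrm A}^2$.

Proof. Using lemma 41 and Eq. (23) we obtain $1 = (E|\Phi) = \mathrm{Tr}[E\Phi] = \mathrm{Tr}[I_{D_{\mathrm A}}]/d_{\mathrm A}^2 = D_{\mathrm A}/d_{\mathrm A}^2$. Hence, $D_{\mathrm A} = d_{\mathrm A}^2$. ■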
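A simple parameter count shows why the relation $D_{\mathrm A} = d_{\mathrm A}^2$ singles out complex matrices. The sketch below assumes the standard counts (real dimension $d^2$ for $d \times d$ Hermitian matrices, $d(d+1)/2$ for real symmetric ones) and tests the product rule $D_{\mathrm{AB}} = D_{\mathrm A} D_{\mathrm B}$ that goes with local distinguishability when $d_{\mathrm{AB}} = d_{\mathrm A} d_{\mathrm B}$. The function names are hypothetical.

```python
def complex_dim(d):
    """Real dimension of the space of d x d Hermitian matrices."""
    return d * d

def real_dim(d):
    """Real dimension of the space of d x d real symmetric matrices."""
    return d * (d + 1) // 2

def product_rule_holds(dim, d_a, d_b):
    """Check D_AB = D_A * D_B, assuming d_AB = d_a * d_b."""
    return dim(d_a * d_b) == dim(d_a) * dim(d_b)

complex_ok = product_rule_holds(complex_dim, 2, 3)
real_ok = product_rule_holds(real_dim, 2, 2)  # 10 versus 9 parameters
```

The complex count satisfies the product rule for all dimensions, while two real "qubits" already have one parameter too many (10 against 9).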

An interesting consequence of the relation (E\ $) = 1 
is the following 

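Corollary 26 (No inversion) Let us write an arbitrary state $\rho \in \mathrm{St}_1(\mathrm A)$ as $\rho = \chi_{\mathrm A} + \xi$, with $(e|\xi) = 0$. Then, the linear map $\mathcal N$ defined by $\mathcal N(\rho) = \chi_{\mathrm A} - \xi$ is not a physical transformation.

Proof. Write the state $\Phi$ as $\Phi = \chi_{\mathrm A} \otimes \chi_{\tilde{\mathrm A}} + \Xi$. Since $(e|_{\mathrm A}\,|\Phi)_{\mathrm A\tilde{\mathrm A}} = |\chi)_{\tilde{\mathrm A}}$ one must have $(e|_{\mathrm A}\,|\Xi)_{\mathrm A\tilde{\mathrm A}} = 0$. Therefore, $\Xi$ must be of the form $\Xi = \sum_i \alpha_i \otimes \beta_i$ with $(e|\alpha_i) = 0$ for all $i$. Applying the transformation $\mathcal N$ one then obtains $(\mathcal N \otimes \mathcal I_{\tilde{\mathrm A}})\,\Phi = \chi_{\mathrm A} \otimes \chi_{\tilde{\mathrm A}} - \Xi$. We now prove that this is not a state, and therefore, $\mathcal N$ cannot be a physical transformation. Let $E$ be the teleportation effect. Since $(E|\Phi) = 1$, we have $1 = (E|\chi_{\mathrm A} \otimes \chi_{\tilde{\mathrm A}}) + (E|\Xi) = 1/d_{\mathrm A}^2 + (E|\Xi)$. Now, we have

$$(E|\,(\mathcal N \otimes \mathcal I_{\tilde{\mathrm A}})\,|\Phi) = (E|\chi_{\mathrm A} \otimes \chi_{\tilde{\mathrm A}}) - (E|\Xi) = \frac{2}{d_{\mathrm A}^2} - 1.$$

Since this quantity is negative for every $d_{\mathrm A} > 1$, the map $\mathcal N$ cannot be a physical transformation. ■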
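The value $2/d_{\mathrm A}^2 - 1$ can be reproduced for a qubit in the quantum case. The sketch assumes, for illustration only, that $\Phi$ is the projector on $(|00\rangle + |11\rangle)/\sqrt 2$, whose Pauli expansion is $\frac14(II + XX - YY + ZZ)$, and that the teleportation effect is represented by the same projector. The names `phi`, `invert_first` and `overlap` are hypothetical.

```python
from fractions import Fraction as F

# Pauli coefficients of the assumed state Phi = (1/4) * (II + XX - YY + ZZ).
phi = {("I", "I"): F(1, 4), ("X", "X"): F(1, 4),
       ("Y", "Y"): F(-1, 4), ("Z", "Z"): F(1, 4)}

def invert_first(op):
    """Apply chi + xi -> chi - xi on the first qubit.

    In the Pauli basis this flips the sign of every term whose first
    factor is not the identity.
    """
    return {(a, b): (c if a == "I" else -c) for (a, b), c in op.items()}

def overlap(op1, op2):
    """Tr[op1 op2], using Tr[(s_a x s_b)(s_c x s_d)] = 4 if (a, b) == (c, d) else 0."""
    return 4 * sum(c * op2.get(key, 0) for key, c in op1.items())

d = 2
e_on_phi = overlap(phi, phi)                     # plays the role of (E|Phi)
e_on_inverted = overlap(phi, invert_first(phi))  # plays the role of (E|(N x I)|Phi)
```

The second number is $-1/2$, a negative "probability", in agreement with the formula of the proof for $d_{\mathrm A} = 2$.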

Corollary 27 The matrix $M_{\mathcal N}$ defined as

$$M_{\mathcal N} = \begin{pmatrix} 1 & 0 \\ 0 & -I_{D_{\mathrm A} - 1} \end{pmatrix} \tag{25}$$

cannot represent a physical transformation of system A.

X. DERIVATION OF THE QUBIT 

In this section we show that every two-dimensional system in our theory is a qubit. By this expression we mean that the normalized states in $\mathrm{St}_1(\mathrm A)$ can be represented as density matrices for a quantum system with two-dimensional Hilbert space. With this choice of representation we also show that the effects in $\mathrm{Eff}(\mathrm A)$ are all the positive Hermitian matrices bounded by the identity, and that the reversible transformations $\mathbf G_{\mathrm A}$ act on the states by conjugation with unitary matrices in $SU(2)$.

The first step is to prove that the set of normalized 
states Sti(A) is a sphere. The idea of the proof is a simple 
geometric observation: in the ordinary three-dimensional 
space the sphere is the only compact convex set that has 
an infinite number of pure states connected by orthogo- 
nal transformations. The complete proof is given in the 
following 

Theorem 13 (The Bloch sphere) The normalized states of a system A with $d_{\mathrm A} = 2$ form a sphere and the group $\mathbf G_{\mathrm A}$ is $SO(3)$.

Proof. According to corollary 25, the group of reversible transformations $\mathbf G_{\mathrm A}$ is a compact subgroup of the orthogonal group $O(3)$. It cannot be the whole $O(3)$ because, as we saw in corollary 27, the inversion $-I$ cannot represent a physical transformation. We now show that $\mathbf G_{\mathrm A}$ must be $SO(3)$ by excluding all the other possibilities. From corollary 22 we know that the system A has a continuum of pure states. Therefore, the group $\mathbf G_{\mathrm A}$ must contain a continuous set of transformations. Now, from the classification of the closed subgroups of $O(3)$ we know that there are only two possibilities: i) $\mathbf G_{\mathrm A}$ is $SO(3)$ and ii) $\mathbf G_{\mathrm A}$ is the subgroup generated by $SO(2)$, the group of rotations around a fixed axis, say the z-axis, and possibly by the reflections with respect to planes containing the z-axis. Note that the reflection in the xy-plane is forbidden, because the composition of this reflection with the rotation of $\pi$ around the z-axis would give the inversion, which is forbidden by corollary 26. The case ii) is excluded because in this case the action of the group $\mathbf G_{\mathrm A}$ cannot be transitive. The detailed proof is as follows: because of the $SO(2)$ symmetry, the set of pure states must contain at least a circle in the xy-plane. This circle will be necessarily invariant under all operations in the group. However, since the convex set of states is three-dimensional, there is at least one pure state outside the circle. Clearly, there is no way to transform a state on the circle into a state outside the circle by means of an operation in $\mathbf G_{\mathrm A}$. This is in contradiction with the fact that every two pure states are connected by a reversible transformation. Hence, the case ii) is ruled out. The only remaining alternative is i), namely that $\mathbf G_{\mathrm A} = SO(3)$ and, hence, the set of pure states generated by its action on a fixed pure state is a sphere. ■

Since the convex set of density matrices on a two-dimensional Hilbert space is a sphere, we can represent the states in $\mathrm{St}_1(\mathrm A)$ as density matrices. Precisely, we can choose three orthogonal axes passing through the center of the sphere and call them x, y, z axes, take $\varphi_{+,k}, \varphi_{-,k}$, $k = x, y, z$ to be the two perfectly distinguishable pure states in the direction of the $k$-axis and define $\sigma_k := \varphi_{+,k} - \varphi_{-,k}$. From the geometry of the sphere we know that any state $\rho \in \mathrm{St}_1(\mathrm A)$ can be written as

$$|\rho) = |\chi) + \frac12 \sum_{k=x,y,z} n_k\,|\sigma_k), \qquad \sum_{k=x,y,z} n_k^2 \le 1, \tag{26}$$

where the pure states are those for which $\sum_{k=x,y,z} n_k^2 = 1$. The Bloch representation $S_\rho$ of the quantum state $\rho$ is then obtained by associating the basis vectors $\chi, \sigma_x, \sigma_y, \sigma_z$ to the matrices

$$S_\chi = \frac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad S_{\sigma_x} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad S_{\sigma_y} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad S_{\sigma_z} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

and by defining $S_\rho$ by linearity from Eq. (26). Clearly, in this way we obtain

$$S_\rho = \frac12 \begin{pmatrix} 1 + n_z & n_x - i n_y \\ n_x + i n_y & 1 - n_z \end{pmatrix},$$

which is the expression of a generic density matrix. Denoting by $M_2(\mathbb C)$ the set of complex two-by-two matrices we have the following

Corollary 28 (Qubit density matrices) For $d_{\mathrm A} = 2$ the set of states $\mathrm{St}_1(\mathrm A)$ is isomorphic to the set of density matrices in $M_2(\mathbb C)$ through the isomorphism $\rho \mapsto S_\rho$.
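The map $\rho \mapsto S_\rho$ is easy to write down explicitly. The following sketch builds the matrix from a Bloch vector using the Pauli matrices given above and computes the purity $\mathrm{Tr}[S_\rho^2] = (1 + |\vec n|^2)/2$, which equals 1 exactly for points on the sphere; `bloch_to_density` and `purity` are hypothetical helper names.

```python
def bloch_to_density(n):
    """S_rho = (1/2) * [[1 + nz, nx - i*ny], [nx + i*ny, 1 - nz]]."""
    nx, ny, nz = n
    return [[complex(1 + nz) / 2, complex(nx, -ny) / 2],
            [complex(nx, ny) / 2, complex(1 - nz) / 2]]

def trace(m):
    return m[0][0] + m[1][1]

def purity(m):
    """Tr[m^2] for a 2x2 matrix."""
    return sum(m[i][j] * m[j][i] for i in range(2) for j in range(2)).real

pure = bloch_to_density((0.0, 0.6, 0.8))    # |n| = 1: on the sphere
mixed = bloch_to_density((0.1, 0.2, -0.2))  # |n| < 1: inside the ball
```

Both matrices have unit trace; only the first one has purity 1.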

Once we decide to represent the states in $\mathrm{St}_1(\mathrm A)$ as matrices, the effects in $\mathrm{Eff}(\mathrm A)$ are necessarily represented by matrices too. The matrix representation of an effect, given by the map $a \in \mathrm{Eff}(\mathrm A) \mapsto E_a \in M_2(\mathbb C)$, is defined uniquely by the relation

$$\mathrm{Tr}[E_a S_\rho] = (a|\rho) \qquad \forall \rho \in \mathrm{St}(\mathrm A).$$

We then have the following

Corollary 29 For $d_{\mathrm A} = 2$ the set of effects $\mathrm{Eff}(\mathrm A)$ is isomorphic to the set of positive Hermitian matrices $P \in M_2(\mathbb C)$ such that $P \le I$.

Proof. Clearly the matrix $E_a$ must be positive for every effect $a$, since we have $\mathrm{Tr}[E_a S_\rho] = (a|\rho) \ge 0$ for every density matrix $S_\rho$. Moreover, since we have $\mathrm{Tr}[E_a S_\rho] = (a|\rho) \le 1$ for every density matrix $S_\rho$, we must have $E_a \le I$. Finally, we know that for every couple of perfectly distinguishable pure states $\varphi, \varphi_\perp$ there exists an atomic effect $a$ such that $(a|\varphi) = 1$ and $(a|\varphi_\perp) = 0$. Since the two pure states $\varphi, \varphi_\perp$ are represented by orthogonal rank-one projectors $S_\varphi$ and $S_{\varphi_\perp}$, we must have $E_a = S_\varphi$. This proves that the atomic effects are the whole set of positive rank-one projectors. As a consequence, also every positive matrix $P$ with $P \le I$ must represent some effect $a$. ■

Finally, the reversible transformations are represented as conjugations by unitary matrices in $SU(2)$:

Corollary 30 For every reversible transformation $\mathcal U \in \mathbf G_{\mathrm A}$ with $d_{\mathrm A} = 2$ there exists a unitary matrix $U \in SU(2)$ such that

$$S_{\mathcal U\rho} = U S_\rho U^\dagger \qquad \forall \rho \in \mathrm{St}(\mathrm A). \tag{27}$$

Conversely, for every $U \in SU(2)$ there exists a reversible transformation $\mathcal U \in \mathbf G_{\mathrm A}$ such that Eq. (27) holds.

Proof. Every rotation of the Bloch sphere is represented by conjugation by some $SU(2)$ matrix. Conversely, every conjugation by an $SU(2)$ matrix represents some rotation on the Bloch sphere. On the other hand, we know that $\mathbf G_{\mathrm A}$ is the group of all rotations on the Bloch sphere (theorem 13). ■
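The correspondence between $SU(2)$ conjugations and rotations can be made concrete for rotations about the z-axis. The sketch below uses the standard matrix $U = \mathrm{diag}(e^{-i\theta/2}, e^{i\theta/2})$ and checks that conjugation rotates the Bloch vector by the angle $\theta$; the helper names are hypothetical.

```python
import cmath
import math

def bloch_to_density(n):
    nx, ny, nz = n
    return [[complex(1 + nz) / 2, complex(nx, -ny) / 2],
            [complex(nx, ny) / 2, complex(1 - nz) / 2]]

def density_to_bloch(rho):
    """Inverse of bloch_to_density: nx + i*ny = 2*rho[1][0], nz = rho[0][0] - rho[1][1]."""
    return (2 * rho[1][0].real, 2 * rho[1][0].imag, (rho[0][0] - rho[1][1]).real)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(u):
    return [[u[j][i].conjugate() for j in range(2)] for i in range(2)]

def z_rotation(theta):
    """SU(2) matrix diag(exp(-i*theta/2), exp(i*theta/2))."""
    return [[cmath.exp(-1j * theta / 2), 0j], [0j, cmath.exp(1j * theta / 2)]]

def conjugate_by(u, rho):
    return matmul(matmul(u, rho), dagger(u))

u = z_rotation(math.pi / 2)
minus_u = [[-x for x in row] for row in u]
rotated = density_to_bloch(conjugate_by(u, bloch_to_density((1.0, 0.0, 0.0))))
rotated_minus = density_to_bloch(conjugate_by(minus_u, bloch_to_density((1.0, 0.0, 0.0))))
```

The x-axis is mapped to the y-axis, and $U$ and $-U$ induce the same rotation, reflecting the two-to-one covering of $SO(3)$ by $SU(2)$.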

Note that we proved that all two-dimensional systems A and B in our theory have the same states ($\mathrm{St}_1(\mathrm A) \simeq \mathrm{St}_1(\mathrm B)$), the same effects ($\mathrm{Eff}(\mathrm A) \simeq \mathrm{Eff}(\mathrm B)$), and the same reversible transformations ($\mathbf G_{\mathrm A} \simeq \mathbf G_{\mathrm B}$), but we did not show that A and B are operationally equivalent. For example, A and B could be different when we compose them with a third system C: the sets of states $\mathrm{St}_1(\mathrm{AC})$ and $\mathrm{St}_1(\mathrm{BC})$ could be non-isomorphic. The fact that every couple of two-dimensional systems A and B are operationally equivalent will be proved later (cf. corollary 0).

We conclude this section with a simple fact that will 
be very useful later: 

Corollary 31 (Superposition principle for qubits) Let $\{\varphi_1, \varphi_2\} \subset \mathrm{St}_1(\mathrm A)$ be two perfectly distinguishable pure states of a system A with $d_{\mathrm A} = 2$. Let $\{a_1, a_2\}$ be the observation-test such that $(a_i|\varphi_j) = \delta_{ij}$. Then, for every probability $0 \le p \le 1$ there exists a pure state $\psi_p \in \mathrm{St}_1(\mathrm A)$ such that

$$(a_1|\psi_p) = p, \qquad (a_2|\psi_p) = 1 - p. \tag{28}$$

Precisely, the set of pure states $\psi_p \in \mathrm{St}_1(\mathrm A)$ satisfying Eq. (28) is a circle in the Bloch sphere.

Proof. Elementary property of density matrices. ■
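The "elementary property" can be spelled out with state vectors, assuming the standard Hilbert-space picture: the vectors $\sqrt p\,|0\rangle + e^{i\phi}\sqrt{1-p}\,|1\rangle$ all give outcome probabilities $(p, 1-p)$, and their Bloch vectors lie on the circle of height $n_z = 2p - 1$ and radius $2\sqrt{p(1-p)}$. The names `psi_p`, `outcome_probabilities` and `bloch_vector` are hypothetical.

```python
import cmath
import math

def psi_p(p, phase):
    """State vector sqrt(p)|0> + exp(i*phase) * sqrt(1-p)|1>."""
    return (math.sqrt(p), cmath.exp(1j * phase) * math.sqrt(1 - p))

def outcome_probabilities(state):
    return (abs(state[0]) ** 2, abs(state[1]) ** 2)

def bloch_vector(state):
    a, b = state
    off_diagonal = 2 * b * a.conjugate()  # equals nx + i*ny
    return (off_diagonal.real, off_diagonal.imag, abs(a) ** 2 - abs(b) ** 2)

p = 0.3
phases = [0.0, 1.0, 2.5, 4.0]
vectors = [bloch_vector(psi_p(p, phase)) for phase in phases]
probabilities = [outcome_probabilities(psi_p(p, phase)) for phase in phases]
```

All four Bloch vectors share the same height and the same distance from the z-axis, so they sit on one circle of the sphere.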



XI. PROJECTIONS 

In this section we define the projection on a face $F$ of the convex set $\mathrm{St}_1(\mathrm A)$ and we prove several properties of projections. The projection on the face $F$ will be defined as an atomic operation $\Pi_F \in \mathrm{Transf}(\mathrm A)$ that acts as the identity on states in the face $F$ and that annihilates the states on the orthogonal face $F^\perp$. In the following we first introduce the concept of orthogonal face, then prove the existence and uniqueness of projections, and finally give some useful results on the projection of a pure state on two orthogonal faces.

A. Orthogonal faces and orthogonal complements

In order to introduce the notion of orthogonal face we need first a few elementary results. We start by showing that there is a canonical way to associate a state $\omega_F$ to a face $F$:

Lemma 42 (State associated to a face) Let $F$ be a face of the convex set $\mathrm{St}_1(\mathrm A)$ and let $\{\varphi_i\}_{i=1}^{|F|}$ be a maximal set of perfectly distinguishable pure states in $F$. Then the state $\omega_F := \frac{1}{|F|} \sum_{i=1}^{|F|} \varphi_i$ depends only on the face $F$ and not on the particular set $\{\varphi_i\}_{i=1}^{|F|}$. Moreover, $F$ is the face identified by $\omega_F$.

Proof. Suppose that $F$ is the face identified by $\rho$ and let $\mathcal E \in \mathrm{Transf}(\mathrm A, \mathrm C)$ (resp. $\mathcal D \in \mathrm{Transf}(\mathrm C, \mathrm A)$) be the encoding (resp. decoding) in the ideal compression for $\rho$. By lemma |] and corollary, $\{\mathcal E\varphi_i\}_{i=1}^{|F|}$ is a maximal set of perfectly distinguishable pure states of C and by theorem 9 one has $\chi_{\mathrm C} = \frac{1}{|F|} \sum_{i=1}^{|F|} \mathcal E\varphi_i$. Hence, $\omega_F = \frac{1}{|F|} \sum_{i=1}^{|F|} \varphi_i = \frac{1}{|F|} \sum_{i=1}^{|F|} \mathcal D\mathcal E\varphi_i = \mathcal D\chi_{\mathrm C}$. Since the right-hand side of the equality is independent of the particular set $\{\varphi_i\}_{i=1}^{|F|}$, the state $\omega_F$ in the left-hand side is independent too. To prove that $F$ is the face identified by $\omega_F$ it is enough to observe that $\omega_F$ is completely mixed relative to $F$: this fact follows from the relation $\omega_F = \mathcal D\chi_{\mathrm C}$ and from lemma ^]. ■

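We now define the orthogonal complement of the state $\omega_F$:

Definition 9 The orthogonal complement of the state $\omega_F$ is the state $\omega_F^\perp \in \mathrm{St}_1(\mathrm A) \cup \{0\}$ defined as follows:

1. if $|F| = d_{\mathrm A}$, then $\omega_F^\perp = 0$,

2. if $|F| < d_{\mathrm A}$, then $\omega_F^\perp$ is defined by the relation

$$\chi_{\mathrm A} = \frac{|F|}{d_{\mathrm A}}\,\omega_F + \frac{d_{\mathrm A} - |F|}{d_{\mathrm A}}\,\omega_F^\perp. \tag{29}$$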

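An easy way to write the orthogonal complement is

Lemma 43 Take a maximal set $\{\varphi_i\}_{i=1}^{|F|}$ of perfectly distinguishable pure states in $F$ and extend it to a maximal set $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$ of perfectly distinguishable pure states in $\mathrm{St}_1(\mathrm A)$. Then for $|F| < d_{\mathrm A}$ we have

$$\omega_F^\perp = \frac{1}{d_{\mathrm A} - |F|} \sum_{i=|F|+1}^{d_{\mathrm A}} \varphi_i.$$

Proof. By definition, for $|F| < d_{\mathrm A}$ we have $\omega_F^\perp = \frac{1}{d_{\mathrm A} - |F|}\,(d_{\mathrm A}\,\chi_{\mathrm A} - |F|\,\omega_F)$. Substituting the expressions $\chi_{\mathrm A} = \frac{1}{d_{\mathrm A}} \sum_{i=1}^{d_{\mathrm A}} \varphi_i$ and $\omega_F = \frac{1}{|F|} \sum_{i=1}^{|F|} \varphi_i$ we then obtain the thesis. ■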
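Definition 9 and lemma 43 can be checked on diagonal states, where a state is represented by its list of weights on a fixed maximal set $\{\varphi_i\}$ and the face $F$ is spanned by the first $|F|$ of them. This representation is an assumption made only for the sake of the example; `omega_f`, `omega_f_perp` and `eq29_rhs` are hypothetical names.

```python
from fractions import Fraction as F

def omega_f(d, f):
    """Weights of omega_F: uniform on the first f states of the maximal set."""
    return [F(1, f) if i < f else F(0) for i in range(d)]

def omega_f_perp(d, f):
    """Weights of the orthogonal complement (lemma 43); the zero vector if f == d."""
    if f == d:
        return [F(0)] * d
    return [F(0) if i < f else F(1, d - f) for i in range(d)]

def chi(d):
    return [F(1, d)] * d

def eq29_rhs(d, f):
    """Right-hand side of Eq. (29): (f/d) * omega_F + ((d - f)/d) * omega_F_perp."""
    w, w_perp = omega_f(d, f), omega_f_perp(d, f)
    return [F(f, d) * a + F(d - f, d) * b for a, b in zip(w, w_perp)]

d, f = 5, 2
supports_overlap = sum(a * b for a, b in zip(omega_f(d, f), omega_f_perp(d, f)))
```

The two states have disjoint supports, which is the diagonal counterpart of their perfect distinguishability.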

Note, however, that by definition the orthogonal complement $\omega_F^\perp$ depends only on the face $F$ and not on the choice of the maximal set in lemma 43.

An obvious consequence of lemma 43 is

Corollary 32 The states $\omega_F$ and $\omega_F^\perp$ are perfectly distinguishable.

Proof. Take a maximal set $\{\varphi_i\}_{i=1}^{|F|}$ of perfectly distinguishable pure states in $F$, extend it to a maximal set $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$, and take the observation-test $\{a_i\}_{i=1}^{d_{\mathrm A}}$ such that $(a_i|\varphi_j) = \delta_{ij}$. Then the binary test $\{a_F, e - a_F\}$, defined by $a_F := \sum_{i=1}^{|F|} a_i$, distinguishes perfectly between $\omega_F$ and $\omega_F^\perp$. ■

We say that a state $\tau \in \mathrm{St}_1(\mathrm A)$ is perfectly distinguishable from the face $F$ if $\tau$ is perfectly distinguishable from every state $\sigma$ in the face $F$. With this definition we have the following

Lemma 44 The following are equivalent:

1. $\tau$ is perfectly distinguishable from the face $F$

2. $\tau$ is perfectly distinguishable from $\omega_F$

3. $\tau$ belongs to the face identified by $\omega_F^\perp$, i.e. $\tau \in F_{\omega_F^\perp}$.

Proof. (1 ⇔ 2) $\tau$ is perfectly distinguishable from $\omega_F$ if and only if there exists a binary test $\{a, e - a\}$ such that $(a|\tau) = 1$ and $(a|\omega_F) = 0$. By lemma 19 this is equivalent to the condition $(a|\tau) = 1$ and $a =_{\omega_F} 0$, that is, $\tau$ is distinguishable from any state $\sigma$ in the face identified by $\omega_F$, which by definition is $F$. (2 ⇒ 3) Let $\{\varphi_i\}_{i=1}^{|F|}$ be a maximal set of perfectly distinguishable states in $F$, $\omega_F = \frac{1}{|F|} \sum_{i=1}^{|F|} \varphi_i$, and let $\{\varphi_i\}_{i=|F|+1}^{k}$ be the maximal set of perfectly distinguishable pure states in the spectral decomposition $\tau = \sum_{i=|F|+1}^{k} p_i \varphi_i$, with $p_i > 0$ for every $i = |F| + 1, \dots, k$. Since $\tau$ is perfectly distinguishable from $\omega_F$, by lemma ^3] we have that the states $\{\varphi_i\}_{i=1}^{k}$ are all perfectly distinguishable. Let us extend this set to a maximal set $\{\varphi_i\}_{i=1}^{d_{\mathrm A}}$. By lemma 43 we have $\omega_F^\perp = \frac{1}{d_{\mathrm A} - |F|} \sum_{i=|F|+1}^{d_{\mathrm A}} \varphi_i$. Hence, all the states $\{\varphi_i\}_{i=|F|+1}^{d_{\mathrm A}}$ are in the face $F_{\omega_F^\perp}$. Since $\tau$ is a mixture of these states, it also belongs to the face $F_{\omega_F^\perp}$. (3 ⇒ 2) Since $\omega_F$ and $\omega_F^\perp$ are perfectly distinguishable, if $\tau$ belongs to the face identified by $\omega_F^\perp$, then by lemma 22 $\tau$ is perfectly distinguishable from $\omega_F$. ■

Corollary 33 If $\rho$ is perfectly distinguishable from $\sigma$ and from $\tau$, then $\rho$ is perfectly distinguishable from any convex mixture of $\sigma$ and $\tau$. 

Proof. Let F be the face identified by $\rho$. Then by lemma 44 we have $\sigma, \tau \in F_{\omega_F^\perp}$. Since $F_{\omega_F^\perp}$ is a convex set, any mixture of $\sigma$ and $\tau$ belongs to it. By lemma 44, this means that any mixture of $\sigma$ and $\tau$ is perfectly distinguishable from $\rho$. ■ 

We are now ready to give the definition of orthogonal 
face: 

Definition 10 (Orthogonal face) The orthogonal 
face $F^\perp$ is the set of all states that are perfectly 
distinguishable from the face F. 

By lemma 44 it is clear that $F^\perp$ is the face identified by $\omega_F^\perp$, that is $F^\perp = F_{\omega_F^\perp}$. 

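In the following we list a few elementary facts about orthogonal faces: 

Lemma 45 The following properties hold 

1. $|F^\perp| = d_A - |F|$ 
2. $\chi_A = \frac{|F|}{d_A}\,\omega_F + \frac{|F^\perp|}{d_A}\,\omega_{F^\perp}$ 
3. $\omega_{F^\perp} = \omega_F^\perp$ 
4. $\omega_{F^\perp}^\perp = \omega_F$ 
5. $(F^\perp)^\perp = F$. 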

Proof. Item 1. If $|F| = d_A$ the thesis is obvious. If $|F| < d_A$, take a maximal set $\{\varphi_i\}_{i=1}^{|F|}$ (resp. $\{\varphi_j\}_{j=|F|+1}^{|F|+|F^\perp|}$) of perfectly distinguishable pure states in F (resp. $F^\perp$). Hence we have 

$$\omega_F = \frac{1}{|F|} \sum_{i=1}^{|F|} \varphi_i \qquad \text{resp.} \qquad \omega_{F^\perp} = \frac{1}{|F^\perp|} \sum_{j=|F|+1}^{|F|+|F^\perp|} \varphi_j .$$

By corollary 32 the states $\omega_F$ and $\omega_{F^\perp}$ are perfectly distinguishable. Hence, the states $\{\varphi_i\}_{i=1}^{|F|+|F^\perp|}$ are perfectly distinguishable jointly (lemma 23). Now, we must have $|F| + |F^\perp| = d_A$, otherwise there would be a pure state $\psi$ that is perfectly distinguishable from the states $\{\varphi_i\}_{i=1}^{|F|+|F^\perp|}$. This implies that $\psi$ belongs to $F^\perp$ and that the states $\{\psi\} \cup \{\varphi_j\}_{j=|F|+1}^{|F|+|F^\perp|}$ are perfectly distinguishable in $F^\perp$, in contradiction with the hypothesis that the set $\{\varphi_j\}_{j=|F|+1}^{|F|+|F^\perp|}$ is maximal in $F^\perp$. Item 2. Immediate from item 1 and definition |^. Items 3 and 4. Both items follow by comparison of item 2 with Eq. (29). Item 5. By condition 3 of lemma 44, $(F^\perp)^\perp$ is the face identified by the state $\omega_{F^\perp}^\perp$, which, by item 4, is $\omega_F$. Since the face identified by $\omega_F$ is F, we have $(F^\perp)^\perp = F$. ■ 

We now show that there is a canonical way to associate 
an effect to a face F: 

Definition 11 (Effect associated to a face) We say that $a_F \in \mathrm{Eff}(A)$ is the effect associated to the face $F \subseteq \mathrm{St}_1(A)$ if and only if $a_F =_{\omega_F} e$ and $a_F =_{\omega_F^\perp} 0$. 

In other words, the definition imposes that $(a_F|\rho) = 1$ for every $\rho \in F$ and $(a_F|\sigma) = 0$ for every $\sigma \in F^\perp$. 

Lemma 46 A state $\rho \in \mathrm{St}_1(A)$ belongs to the face F if and only if $(a_F|\rho) = 1$. 

Proof. By definition, if $\rho$ belongs to F, then $(a_F|\rho) = 1$. Conversely, if $(a_F|\rho) = 1$, then $\rho$ is perfectly distinguishable from $\omega_F^\perp$, because $(a_F|\omega_F^\perp) = 0$. Now, we know that $\omega_F^\perp$ is equal to $\omega_{F^\perp}$ (item 4 of lemma 45). By item 2 of lemma 44 the fact that $\rho$ is perfectly distinguishable from $\omega_{F^\perp}$ implies that $\rho$ belongs to $(F^\perp)^\perp$, which is just F (item 5 of lemma 45). ■ 

We now show that the effect $a_F$ associated to the face F exists and is unique. A preliminary result needed for this purpose is the following: 

Lemma 47 The effect $a_F$ must have the form $a_F = \sum_{i=1}^{|F|} a_i$, where $a_i$ is the atomic effect such that $(a_i|\varphi_i) = 1$ and $\{\varphi_i\}_{i=1}^{|F|}$ is a maximal set of perfectly distinguishable pure states in F. 

Proof. By corollary ^ we have that $a_F$ can be written as $(a_F| = \sum_{i=1}^{d_A} d_i\,(a_i|$ where $\{a_i\}_{i=1}^{d_A}$ is a perfectly distinguishing test. Moreover, since $a_F$ is an effect, we must have $d_i \ge 0$ for all $i = 1, \dots, d_A$. Now, by definition we have $(a_F|\omega_F^\perp) = 0$, which implies $d_i\,(a_i|\omega_F^\perp) = 0$ for every $i = 1, \dots, d_A$, that is, $(a_i|\omega_F^\perp) = 0$ whenever $d_i \neq 0$. Let us focus on the values of i for which $d_i \neq 0$. Let $\varphi_i$ be the pure state such that $(a_i|\varphi_i) = 1$. The condition $(a_i|\omega_F^\perp) = 0$ implies that $\varphi_i$ is perfectly distinguishable from $\omega_F^\perp$. Therefore, $\varphi_i$ belongs to $(F^\perp)^\perp$, which is F. Since by definition we must have $(a_F|\varphi_i) = 1$, this also implies that $d_i = 1$. In summary, we proved that $a_F = \sum_i' a_i$ where the prime means that the sum is restricted to those values of i such that $\varphi_i \in F$. The condition $a_F =_{\omega_F} e$ also implies that the number of terms in the sum must be exactly $|F|$. The thesis is then proved by suitably relabelling the effects $\{a_i\}_{i=1}^{d_A}$, in such a way that $\varphi_i$ belongs to F for every $i = 1, \dots, |F|$. ■ 

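Lemma 48 The effect $a_F$ associated to the face F is unique. 

Proof. Suppose that $a_F = \sum_{i=1}^{|F|} a_i$ and $a'_F = \sum_{i=1}^{|F|} a'_i$ are two effects associated to the face F, both written as in lemma 47. Let $\{\varphi_i\}_{i=1}^{|F|}$ (resp. $\{\varphi'_i\}_{i=1}^{|F|}$) be the maximal set of perfectly distinguishable pure states in F such that $(a_i|\varphi_i) = 1$ for every $i = 1, \dots, |F|$ (resp. $(a'_i|\varphi'_i) = 1$ for every $i = 1, \dots, |F|$) and let $\{\psi_j\}_{j=1}^{|F^\perp|}$ be a maximal set of perfectly distinguishable pure states in $F^\perp$. Since $\omega_F$ and $\omega_F^\perp$ are perfectly distinguishable, the states $\{\varphi_i\}_{i=1}^{|F|} \cup \{\psi_j\}_{j=1}^{|F^\perp|}$ (resp. $\{\varphi'_i\}_{i=1}^{|F|} \cup \{\psi_j\}_{j=1}^{|F^\perp|}$) are perfectly distinguishable (lemma 23). Moreover, the set is maximal since $|F| + |F^\perp| = d_A$. Let $b_j$ be the atomic effect such that $(b_j|\psi_j) = 1$. Then, the test that distinguishes the states $\{\varphi_i\}_{i=1}^{|F|} \cup \{\psi_j\}_{j=1}^{|F^\perp|}$ (resp. $\{\varphi'_i\}_{i=1}^{|F|} \cup \{\psi_j\}_{j=1}^{|F^\perp|}$) is given by $\{a_i\}_{i=1}^{|F|} \cup \{b_j\}_{j=1}^{|F^\perp|}$ (resp. $\{a'_i\}_{i=1}^{|F|} \cup \{b_j\}_{j=1}^{|F^\perp|}$) and its normalization reads 

$$e = \sum_{i=1}^{|F|} a_i + \sum_{j=1}^{|F^\perp|} b_j = a_F + \sum_{j=1}^{|F^\perp|} b_j \qquad \Bigl(\text{resp. } e = \sum_{i=1}^{|F|} a'_i + \sum_{j=1}^{|F^\perp|} b_j = a'_F + \sum_{j=1}^{|F^\perp|} b_j \Bigr).$$

By comparison we obtain $a_F = a'_F$. ■ 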



B. Projections 

We are now in position to define the projection on a face: 

Definition 12 (Projection) Let F be a face of $\mathrm{St}_1(A)$. A projection on the face F is an atomic transformation $\Pi_F$ such that 

1. $\Pi_F =_{\omega_F} \mathcal{I}_A$ 

2. $\Pi_F =_{\omega_F^\perp} 0$. 

When F is the face identified by a pure state $\varphi \in \mathrm{St}_1(A)$, we have $F = \{\varphi\}$ and call $\Pi_{\{\varphi\}}$ a projection on the pure state $\varphi$. 

The first condition in definition 12 means that the projection $\Pi_F$ does not disturb the states in the face F. The second condition means that $\Pi_F$ annihilates all states in the orthogonal face $F^\perp$. As a notation, we will indicate with $\Pi_F^\perp$ the projection on the face $F^\perp$, that is, we will use the definition $\Pi_F^\perp := \Pi_{F^\perp}$. 

An equivalent condition for $\Pi_F$ to be a projection on the face F is the following: 

Lemma 49 Let $\{\varphi_i\}_{i=1}^{d_A}$ be a maximal set of perfectly distinguishable pure states for system A. The transformation $\Pi_F$ in Transf(A) is a projection on the face generated by the subset $\{\varphi_i\}_{i=1}^{|F|}$ if and only if 

1. $\Pi_F =_{\omega_F} \mathcal{I}_A$ 

2. $\Pi_F\,|\varphi_l) = 0$ for all $l > |F|$. 

Proof. The condition is clearly necessary, since by Definition 12 $\Pi_F\,|\varphi_l) = 0$ for $l > |F|$. On the other hand, if $\Pi_F\,|\varphi_l) = 0$ for $l > |F|$ then by definition of $\omega_F^\perp$ we have $\Pi_F\,|\omega_F^\perp) = 0$, and, therefore, $\Pi_F =_{\omega_F^\perp} 0$. ■ 

A result that will be useful later is: 

Lemma 50 The transformation $\Pi_F \otimes \mathcal{I}_B$ is a projection on the face $\tilde F$ identified by the state $\omega_F \otimes \chi_B$. 

Proof. $\Pi_F \otimes \mathcal{I}_B$ is atomic, being the product of two atomic transformations. We now show that $\Pi_F \otimes \mathcal{I}_B =_{\omega_F \otimes \chi_B} \mathcal{I}_A \otimes \mathcal{I}_B$. Indeed, by the local tomography axiom it is easy to see that every state $\sigma \in F_{\omega_F \otimes \chi_B}$ can be written as $|\sigma) = \sum_{i=1}^{r} \sum_{j=1}^{D_B} \sigma_{ij}\,|\alpha_i)|\beta_j)$, where $\{\alpha_i\}_{i=1}^{r}$ is a basis for Span(F) and $\{\beta_j\}_{j=1}^{D_B}$ is a basis for $\mathrm{St}_1(B)$. Since $\Pi_F =_{\omega_F} \mathcal{I}_A$, we have 

$$(\Pi_F \otimes \mathcal{I}_B)\,|\sigma) = \sum_{i=1}^{r} \sum_{j=1}^{D_B} \sigma_{ij}\; \Pi_F|\alpha_i)\,|\beta_j) = |\sigma),$$

which implies $\Pi_F \otimes \mathcal{I}_B =_{\omega_F \otimes \chi_B} \mathcal{I}_A \otimes \mathcal{I}_B$. Finally, note that $\omega_{\tilde F} = \omega_F \otimes \chi_B$, while $\omega_{\tilde F}^\perp = \omega_F^\perp \otimes \chi_B$. Since we have $(\Pi_F \otimes \mathcal{I}_B)\,|\omega_{\tilde F}^\perp) = \Pi_F|\omega_F^\perp) \otimes |\chi_B) = 0$, we can conclude $\Pi_F \otimes \mathcal{I}_B =_{\omega_{\tilde F}^\perp} 0$. Hence $\Pi_F \otimes \mathcal{I}_B$ is a projection on $\tilde F$. ■ 


In the following we will show that for every face F there exists a unique projection $\Pi_F$ and we will prove several properties of projections. Let us start from an elementary observation: 

Lemma 51 Let $\varphi$ be a pure state in the face $F \subseteq \mathrm{St}_1(A)$ and let $a \in \mathrm{Eff}(A)$ be the atomic effect such that $(a|\varphi) = 1$. If $\mathcal{A} \in \mathrm{Transf}(A)$ is an atomic transformation such that $\mathcal{A} =_{\omega_F} \mathcal{I}_A$, then $(a|\mathcal{A} = (a|$. Moreover, if $a_F$ is the effect associated to the face F, then we have $(a_F|\mathcal{A} = (a_F|$. 

Proof. By lemma 16, the effect $(a|\mathcal{A}$ is atomic. Now, since $\mathcal{A}|\varphi) = |\varphi)$, we have $(a|\mathcal{A}|\varphi) = (a|\varphi) = 1$. However, by theorem || $(a|$ is the unique atomic effect such that $(a|\varphi) = 1$. Hence, $(a|\mathcal{A} = (a|$. Moreover, writing $a_F$ as $a_F = \sum_{i=1}^{|F|} a_i$ with $(a_i|\varphi_i) = 1$, $\varphi_i \in F$ (lemma 47), we obtain $(a_F|\mathcal{A} = \sum_{i=1}^{|F|} (a_i|\mathcal{A} = \sum_{i=1}^{|F|} (a_i| = (a_F|$. ■ 

When applied to the case of projections, the above lemma gives the following 

Corollary 34 Let $\varphi$ be a pure state in the face $F \subseteq \mathrm{St}_1(A)$ and let $a \in \mathrm{Eff}(A)$ be the atomic effect such that $(a|\varphi) = 1$. Then, we have $(a|\Pi_F = (a|$. Moreover, if $a_F$ is the effect associated to the face F, then we have $(a_F| = (a_F|\Pi_F$. 

The counterpart of corollary 34 is given as follows: 

Lemma 52 Let $\psi$ be a pure state in the face $F^\perp$ and let b be the atomic effect such that $(b|\psi) = 1$. Then, we have $(b|\Pi_F = 0$. Moreover, if $a_{F^\perp}$ is the effect associated to the face $F^\perp$, then we have $(a_{F^\perp}|\Pi_F = 0$. 

Proof. By lemma 16, the effect $(b|\Pi_F$ is atomic. Hence, $(b|\Pi_F$ must be proportional to an atomic effect $b'$ with $\|b'\| = 1$, for some proportionality constant $\lambda \in [0, 1]$, that is $(b|\Pi_F = \lambda\,(b'|$. We want to prove that $\lambda$ is zero. By contradiction, suppose that $\lambda \neq 0$. Let $\psi'$ be the pure state such that $(b'|\psi') = 1$. Now, since $\Pi_F|\omega_F^\perp) = 0$, we have $0 = (b|\Pi_F|\omega_F^\perp) = \lambda\,(b'|\omega_F^\perp)$, which implies $(b'|\omega_F^\perp) = 0$. Hence, $\psi'$ is perfectly distinguishable from $\omega_F^\perp$, which in turn implies that $\psi'$ belongs to $(F^\perp)^\perp = F$. We then have $\lambda = (b|\Pi_F|\psi') = (b|\psi') = 0$ (the last equality follows from the fact that $\psi$ and $\psi'$ belong to $F^\perp$ and F, respectively, and hence are perfectly distinguishable). This is in contradiction with the assumption $\lambda \neq 0$, thus concluding the proof that $(b|\Pi_F = 0$. Moreover, writing $a_{F^\perp}$ as $a_{F^\perp} = \sum_{i=1}^{|F^\perp|} b_i$ with $(b_i|\psi_i) = 1$, $\psi_i \in F^\perp$, we obtain $(a_{F^\perp}|\Pi_F = 0$. ■ 

Combining corollary 34 and lemma 52 we obtain an important property of projections, expressed by the following: 

Corollary 35 If $\Pi_F$ is a projection on the face F, then one has $(e_A|\Pi_F = (a_F|$. 

Proof. The thesis follows from corollary 34 and lemma 52 and from the fact that $a_F + a_{F^\perp} = e$. ■ 
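For orientation, the properties collected so far can be checked in standard quantum theory, the theory this derivation ultimately recovers. The sketch below assumes the Hilbert-space picture, which the argument above does not use: a face F corresponds to the density matrices supported on a subspace with projector P, the map $\Pi_F$ to $\rho \mapsto P\rho P$, and the effect $a_F$ to $\rho \mapsto \mathrm{Tr}[P\rho]$. The helper names `matmul`, `trace` and `project` are hypothetical.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two square matrices given as nested lists."""
    n = len(Y)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(len(X))]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def project(P, rho):
    """Quantum analogue of Pi_F (an assumption of this sketch): rho -> P rho P, unnormalized."""
    return matmul(matmul(P, rho), P)

# Qutrit example: F = states supported on span{|0>, |1>}, so |F| = 2 and d_A = 3.
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
rho_in_F = [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 0]]        # pure state (|0>+|1>)/sqrt(2)
rho_perp = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]                # |2><2|, in the orthogonal face
rho_mix = [[0.25, 0, 0.25], [0, 0.25, 0], [0.25, 0, 0.5]]   # a state in neither face

undisturbed = project(P, rho_in_F)        # condition 1 of Definition 12
annihilated = project(P, rho_perp)        # condition 2 of Definition 12
prob_left = trace(project(P, rho_mix))    # (e|Pi_F|rho)
prob_right = trace(matmul(P, rho_mix))    # (a_F|rho); Corollary 35 says the two agree
```

In this example `undisturbed` equals `rho_in_F`, `annihilated` is the zero matrix, and both probabilities equal 0.5.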

In the following we will see that for every face F there 
exists a unique projection. To prove that, let us start 
from the existence: 

Lemma 53 (Existence of projections) For every face F of $\mathrm{St}_1(A)$ there exists a projection $\Pi_F$. 

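Proof. By lemma 18 there exists a system B and an atomic transformation $\mathcal{A} \in \mathrm{Transf}(A, B)$ with $(e|_B\,\mathcal{A} = (a_F|$. Then, if $\Psi_{\omega_F} \in \mathrm{St}_1(AC)$ is a purification of $\omega_F$, we can define the state $|\Sigma)_{BC} := (\mathcal{A} \otimes \mathcal{I}_C)\,|\Psi_{\omega_F})_{AC}$. By lemma 16 $\Sigma$ is a pure state. Moreover, the pure states $\Sigma$ and $\Psi_{\omega_F}$ have the same marginal on system C: indeed, we have $(e_B|\,|\Sigma) = [(e_B|\mathcal{A}]\,|\Psi_{\omega_F}) = (a_F|\,|\Psi_{\omega_F})$ and, by definition, $a_F =_{\omega_F} e_A$, which by theorem 1 implies $(a_F|\,|\Psi_{\omega_F}) = (e_A|\,|\Psi_{\omega_F})$. If $\varphi_0$ and $\psi_0$ are two arbitrary pure states of A and B, respectively, the uniqueness of purification stated by Postulate 1 implies that there exists a reversible channel $\mathcal{U} \in \mathbf{G}_{AB}$ such that 

$$(\mathcal{U} \otimes \mathcal{I}_C)\,\bigl[\,|\Sigma)_{BC}\,|\varphi_0)_A\bigr] = |\Psi_{\omega_F})_{AC}\,|\psi_0)_B . \qquad (30)$$

Now, take the atomic effect $b \in \mathrm{Eff}(B)$ such that $(b|\psi_0) = 1$, and define the transformation $\Pi_F \in \mathrm{Transf}(A)$ as 

$$\Pi_F\,|\rho) := (\mathcal{I}_A \otimes (b|_B)\;\mathcal{U}\;(\mathcal{A} \otimes \mathcal{I}_A)\,|\rho)|\varphi_0) .$$

Applying b on both sides of Eq. (30) we then obtain 

$$(\Pi_F \otimes \mathcal{I}_C)\,|\Psi_{\omega_F})_{AC} = |\Psi_{\omega_F})_{AC}$$

and, therefore, $\Pi_F =_{\omega_F} \mathcal{I}_A$. Moreover, the transformation $\Pi_F$ is atomic, being the composition of atomic transformations (lemma 16). Finally, we have $\Pi_F =_{\omega_F^\perp} 0$: indeed, by construction of $\Pi_F$ we have 

$$(e_A|\Pi_F|\rho) = (e_A \otimes b|\,\mathcal{U}(\mathcal{A} \otimes \mathcal{I}_A)\,|\rho \otimes \varphi_0) \le (e_A \otimes e_B|\,\mathcal{U}(\mathcal{A} \otimes \mathcal{I}_A)\,|\rho \otimes \varphi_0) = (e_B|\mathcal{A}|\rho) = (a_F|\rho) .$$

This implies $(e_A|\Pi_F|\omega_F^\perp) \le (a_F|\omega_F^\perp) = 0$, and, therefore, $\Pi_F =_{\omega_F^\perp} 0$. In conclusion, $\Pi_F$ is the desired projection. ■ 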

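To prove the uniqueness of the projection $\Pi_F$ we need two auxiliary lemmas, given in the following. 

Lemma 54 Let $\Phi \in \mathrm{St}_1(A\tilde A)$ be a purification of the invariant state $\chi_A$, and let $\Pi_F \in \mathrm{Transf}(A)$ be a projection on the face $F \subseteq \mathrm{St}_1(A)$. Then, the pure state $\Phi_F \in \mathrm{St}_1(A\tilde A)$ defined by 

$$|\Phi_F)_{A\tilde A} := \frac{d_A}{|F|}\,(\Pi_F \otimes \mathcal{I}_{\tilde A})\,|\Phi)_{A\tilde A} \qquad (31)$$

is a purification of $\omega_F$. 

Proof. The state $\Phi_F$ is pure by lemma 16. Let us choose a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_A}$ such that $\{\varphi_i\}_{i=1}^{|F|}$ is maximal in F. Now, we have 

$$(e_{\tilde A}|\,|\Phi_F)_{A\tilde A} = \frac{d_A}{|F|}\,[\Pi_F \otimes (e_{\tilde A}|]\,|\Phi)_{A\tilde A} = \frac{d_A}{|F|}\,\Pi_F\,|\chi_A),$$

having used the relation $(e_{\tilde A}|\,|\Phi)_{A\tilde A} = |\chi_A)$ (corollary 16). We then obtain 

$$(e_{\tilde A}|\,|\Phi_F)_{A\tilde A} = \frac{d_A}{|F|}\,\Pi_F\,|\chi_A) = \frac{1}{|F|}\sum_{i=1}^{|F|} |\varphi_i) = |\omega_F),$$

having used that $\chi_A = \sum_{i=1}^{d_A} \varphi_i / d_A$ (theorem [) and the definition of $\Pi_F$. ■ 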



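Lemma 55 Let $\Pi_F \in \mathrm{Transf}(A)$ be a projection. A transformation $\mathcal{C} \in \mathrm{Transf}(A)$ satisfies $\mathcal{C} =_{\omega_F} \mathcal{I}_A$ if and only if 

$$\mathcal{C}\,\Pi_F = \Pi_F . \qquad (32)$$

Proof. Let $\Phi_F$ be the purification of $\omega_F$ defined in lemma 54. Since $\mathcal{C} =_{\omega_F} \mathcal{I}_A$, we have $(\mathcal{C} \otimes \mathcal{I})\,|\Phi_F) = |\Phi_F)$. In other words, we have $(\mathcal{C}\,\Pi_F \otimes \mathcal{I})\,|\Phi) = (\Pi_F \otimes \mathcal{I})\,|\Phi)$. Since $\Phi$ is dynamically faithful, this implies that $\mathcal{C}\,\Pi_F = \Pi_F$. Conversely, Eq. (32) implies that for $\sigma \in F_{\omega_F}$, $\mathcal{C}|\sigma) = \mathcal{C}\,\Pi_F|\sigma) = \Pi_F|\sigma) = |\sigma)$, namely $\mathcal{C} =_{\omega_F} \mathcal{I}_A$. ■ 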



Theorem 14 (Uniqueness of projections) The projection $\Pi_F$ satisfying Definition 12 is unique. 

Proof. Let $\Pi_F$ and $\Pi'_F$ be two projections on the same face F, and define the pure states $\Phi_F$ and $\Phi'_F$ as in lemma 54. Now, $\Phi_F$ and $\Phi'_F$ are both purifications of the same state of system $\tilde A$: indeed, one has 

$$(e_A|\,|\Phi_F)_{A\tilde A} = \frac{d_A}{|F|}\,[(e_A|\Pi_F]\,|\Phi)_{A\tilde A} = \frac{d_A}{|F|}\,[(e_A|\Pi'_F]\,|\Phi)_{A\tilde A} = (e_A|\,|\Phi'_F)_{A\tilde A},$$

having used the relation $(e_A|\Pi_F = (a_F| = (e_A|\Pi'_F$, which comes from corollary 35 and from the uniqueness of the effect $a_F$ (lemma 48). By the uniqueness of purification, we have $|\Phi'_F) = (\mathcal{U} \otimes \mathcal{I}_{\tilde A})\,|\Phi_F)$ for some reversible transformation $\mathcal{U} \in \mathbf{G}_A$. This implies $(\Pi'_F \otimes \mathcal{I}_{\tilde A})\,|\Phi) = (\mathcal{U}\,\Pi_F \otimes \mathcal{I}_{\tilde A})\,|\Phi)$, and, since $\Phi$ is dynamically faithful, $\Pi'_F = \mathcal{U}\,\Pi_F$. Since by Definition 12 we have $\Pi'_F =_{\omega_F} \mathcal{I}_A$ and $\Pi_F =_{\omega_F} \mathcal{I}_A$, we can conclude that $\mathcal{U} =_{\omega_F} \mathcal{I}_A$. Finally, using lemma 55 with $\mathcal{C} = \mathcal{U}$ we obtain $\Pi'_F = \mathcal{U}\,\Pi_F = \Pi_F$. ■ 

We now show a few simple properties of projections. In the following, given a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_A}$ and any subset $V \subseteq \{1, \dots, d_A\}$ we define (with a slight abuse of notation) $\omega_V := \sum_{i \in V} \varphi_i / |V|$, and $\Pi_V$ as the projection on the face $F_V := F_{\omega_V}$. We will refer to $F_V$ as the face generated by V. 

Lemma 56 For two arbitrary subsets $V, W \subseteq \{1, \dots, d_A\}$ one has 

$$\Pi_V \Pi_W = \Pi_{V \cap W} .$$

In particular, if $V \cap W = \emptyset$ one has $\Pi_V \Pi_W = 0$. 

Proof. First of all, $\Pi_V \Pi_W$ is atomic, being the product of two atomic transformations. Moreover, since the face $F_{V \cap W}$ is contained in the faces $F_V$ and $F_W$, we have $\Pi_V \Pi_W |\rho) = \Pi_V |\rho) = |\rho)$ for every $\rho \in F_{V \cap W}$. In other words, $\Pi_V \Pi_W =_{\omega_{V \cap W}} \mathcal{I}_A$. Moreover, if $l \notin V \cap W$ we have $\Pi_V \Pi_W |\varphi_l) = 0$. By lemma 49 and by the uniqueness of projections (theorem 14) we then obtain that $\Pi_V \Pi_W$ is the projection on the face generated by $V \cap W$. ■ 

Corollary 36 (Idempotence) Every projection $\Pi_F$ satisfies the identity $\Pi_F^2 = \Pi_F$. 

Proof. Consider a maximal set of perfectly distinguishable pure states $\{\varphi_i\}_{i=1}^{d_A}$ such that $\{\varphi_i\}_{i \in V}$ is maximal in F. In this way F is the face generated by V, and, therefore, $\Pi_F = \Pi_V$. The thesis follows by taking V = W in lemma 56. ■ 

Corollary 37 For every state $\rho \in \mathrm{St}_1(A)$ such that $\rho \notin F^\perp$, the normalized state $\rho'$ defined by 

$$|\rho') = \frac{\Pi_F |\rho)}{(e|\Pi_F|\rho)} \qquad (33)$$

belongs to the face F. 

Proof. By corollary 35, we have $(e|\Pi_F = (a_F|$. Since $\rho \notin F^\perp$, we must have $(e|\Pi_F|\rho) = (a_F|\rho) > 0$, and, therefore, the state $\rho'$ in Eq. (33) is well defined. Moreover, using the definition of $\rho'$ we obtain 

$$(a_F|\rho') = \frac{(a_F|\Pi_F|\rho)}{(e|\Pi_F|\rho)} = 1,$$

having used corollaries 34 and 35 for the last equality. Finally, lemma 46 implies that $\rho'$ belongs to the face F. ■ 

Corollary 38 Let $\Pi_{\{\varphi\}}$ be the projection on the pure state $\varphi \in \mathrm{St}_1(A)$ and a be the atomic effect such that $(a|\varphi) = 1$. Then for every state $\rho \in \mathrm{St}_1(A)$ one has $\Pi_{\{\varphi\}} |\rho) = p\,|\varphi)$ where $p = (a|\rho)$. 

Proof. Recall that, by corollary 35, we have $(a| = (e|\Pi_{\{\varphi\}}$. If $(a|\rho) = 0$ then clearly $\Pi_{\{\varphi\}} |\rho) = 0$. Otherwise, the proof is a straightforward application of corollary 37. ■ 

We conclude the present subsection with a result that will be useful in the next subsection. 

Lemma 57 An atomic transformation $\mathcal{A} \in \mathrm{Transf}(A)$ satisfies $\mathcal{A} =_{\omega_F} \mathcal{I}_A$ if and only if 

$$\Pi_F\,\mathcal{A} = \Pi_F . \qquad (34)$$



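Proof. Suppose that $\mathcal{A} =_{\omega_F} \mathcal{I}_A$. Let $\Phi \in \mathrm{St}_1(A\tilde A)$ be a purification of the invariant state $\chi_A$ and define the two pure states 

$$|\Phi_F) := \frac{d_A}{|F|}\,(\Pi_F \otimes \mathcal{I}_{\tilde A})\,|\Phi), \qquad |\Phi'_F) := \frac{d_A}{|F|}\,(\Pi_F \mathcal{A} \otimes \mathcal{I}_{\tilde A})\,|\Phi) .$$

Then we have 

$$(e_A|\,|\Phi'_F) = \frac{d_A}{|F|}\,[(e_A|\Pi_F \mathcal{A}]\,|\Phi) = \frac{d_A}{|F|}\,(a_F|\,|\Phi) = (e_A|\,|\Phi_F),$$

having used the condition $(a_F|\mathcal{A} = (a_F|$ (lemma 51). Now, we proved that $\Phi_F$ and $\Phi'_F$ have the same marginal on system $\tilde A$. By the uniqueness of purification, there exists a reversible transformation $\mathcal{U} \in \mathbf{G}_A$ such that $|\Phi'_F) = (\mathcal{U} \otimes \mathcal{I}_{\tilde A})\,|\Phi_F)$. Since $\Phi$ is dynamically faithful, this implies $\Pi_F \mathcal{A} = \mathcal{U}\,\Pi_F$. Now, for every $\rho$ in F one has $\mathcal{U}|\rho) = \mathcal{U}\,\Pi_F|\rho) = \Pi_F \mathcal{A}|\rho) = |\rho)$, namely $\mathcal{U} =_{\omega_F} \mathcal{I}_A$. Applying lemma 55 with $\mathcal{C} = \mathcal{U}$ and using the idempotence of projections we then obtain 

$$\Pi_F \mathcal{A} = \mathcal{U}\,\Pi_F = (\mathcal{U}\,\Pi_F)\,\Pi_F = \Pi_F \Pi_F = \Pi_F .$$

Conversely, suppose that Eq. (34) is satisfied. Let $\varphi \in F$ be a pure state in F and a be the atomic effect such that $(a|\varphi) = 1$. Then, we have 

$$(a|\mathcal{A}|\varphi) = (a|\Pi_F \mathcal{A}|\varphi) = (a|\Pi_F|\varphi) = (a|\varphi) = 1,$$

having used the relation $(a|\Pi_F = (a|$ (corollary 34). Then, by theorem ^, $\mathcal{A}\varphi = \varphi$. Since $\varphi \in F$ is arbitrary, this implies $\mathcal{A} =_{\omega_F} \mathcal{I}_A$. ■ 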



C. Projection of a pure state on two orthogonal faces 

In Section X we proved a number of results concerning two-dimensional systems. Some properties of two-dimensional systems will be extended to the case of generic systems using the following lemma: 

Lemma 58 Consider a pure state $\varphi \in \mathrm{St}_1(A)$ and two complementary projections $\Pi_F$ and $\Pi_F^\perp$. Then, $\varphi$ belongs to the face identified by the state $|\theta) := (\Pi_F + \Pi_F^\perp)\,|\varphi)$. 

Proof. If $\Pi_F|\varphi) = 0$ (resp. $\Pi_F^\perp|\varphi) = 0$), then there is nothing to prove: this means that $\Pi_F^\perp|\varphi) = |\varphi)$ (resp. $\Pi_F|\varphi) = |\varphi)$) and the thesis is trivially true. Suppose now that $\Pi_F|\varphi) \neq 0$ and $\Pi_F^\perp|\varphi) \neq 0$. Using the notation $\Pi_1 := \Pi_F$, $\Pi_2 := \Pi_F^\perp$, we can define the two pure states $|\varphi_i) := \Pi_i|\varphi) / (e|\Pi_i|\varphi)$, i = 1, 2, and the probabilities $p_i = (e|\Pi_i|\varphi)$. In this way we have $\Pi_i|\varphi) = p_i\,|\varphi_i)$ for i = 1, 2 and $\theta = p_1\varphi_1 + p_2\varphi_2$. Taking the atomic effect $(a_i|$ such that $(a_i|\varphi_i) = 1$ we have $a_{F_\theta} = a_1 + a_2$, where $a_{F_\theta}$ is the effect associated to the face $F_\theta$. Recalling that $(a_i|\Pi_i = (a_i|$ for i = 1, 2 (corollary 34), we then conclude the following 

$$(a_{F_\theta}|\varphi) = [(a_1| + (a_2|]\,|\varphi) = (a_1|\Pi_1|\varphi) + (a_2|\Pi_2|\varphi) = \sum_{i=1,2} p_i\,(a_i|\varphi_i) = 1 .$$

Finally, lemma 46 yields $\varphi \in F_\theta$. ■ 

A consequence of lemma 58 is the following 

Lemma 59 Let $\varphi \in \mathrm{St}_1(A)$ be a pure state, $a \in \mathrm{Eff}(A)$ be the unique atomic effect such that $(a|\varphi) = 1$, and F be a face in $\mathrm{St}_1(A)$. If $\rho$ is perfectly distinguishable from $\Pi_F|\varphi)$ and from $\Pi_F^\perp|\varphi)$ then $\rho$ is perfectly distinguishable from $\varphi$. In particular, one has $(a|\rho) = 0$. 

Proof. Since $\rho$ is perfectly distinguishable from $\Pi_F|\varphi)$ and $\Pi_F^\perp|\varphi)$, it is also perfectly distinguishable from any convex combination of them (corollary 33). Equivalently, $\rho$ is perfectly distinguishable from the face $F_\theta$ identified by $|\theta) := \Pi_F|\varphi) + \Pi_F^\perp|\varphi)$. In particular, it must be perfectly distinguishable from $\varphi$, which belongs to $F_\theta$ by virtue of lemma 58. If a is the atomic effect such that $(a|\varphi) = 1$, then by lemma 36 we have $(a|\rho) = 0$. ■ 
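A technical result that will be useful in the following is: 

Lemma 60 Let $\varphi \in \mathrm{St}_1(A)$ be a pure state such that $\Pi_F|\varphi) \neq 0$ and $\Pi_F^\perp|\varphi) \neq 0$. Define the pure states $|\varphi_1) := \Pi_F|\varphi) / (e|\Pi_F|\varphi)$ and $|\varphi_2) := \Pi_F^\perp|\varphi) / (e|\Pi_F^\perp|\varphi)$ and the mixed state $|\theta) := (\Pi_F + \Pi_F^\perp)\,|\varphi)$. Then, we have 

$$\Pi_F\,\Pi_{F_\theta} = \Pi_{\{\varphi_1\}} \qquad \text{and} \qquad \Pi_F^\perp\,\Pi_{F_\theta} = \Pi_{\{\varphi_2\}} .$$

Proof. Let $\{\psi_i\}_{i=1}^{|F|}$ be a maximal set of perfectly distinguishable pure states in F, chosen in such a way that $\psi_1 = \varphi_1$, and let $\{\psi_i\}_{i=|F|+1}^{d_A}$ be a maximal set of perfectly distinguishable pure states in $F^\perp$, chosen in such a way that $\psi_{|F|+1} = \varphi_2$. Defining the sets $V := \{1, \dots, |F|\}$, $W := \{|F|+1, \dots, d_A\}$, and $U := \{1, |F|+1\}$ we then have $\Pi_V = \Pi_F$, $\Pi_W = \Pi_F^\perp$ and $\Pi_U = \Pi_{F_\theta}$. Using lemma 56 we obtain 

$$\Pi_F\,\Pi_{F_\theta} = \Pi_V \Pi_U = \Pi_{\{\psi_1\}} = \Pi_{\{\varphi_1\}}$$

and 

$$\Pi_F^\perp\,\Pi_{F_\theta} = \Pi_W \Pi_U = \Pi_{\{\psi_{|F|+1}\}} = \Pi_{\{\varphi_2\}} .$$ ■ 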



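We conclude this subsection with an important observation about the group of reversible transformations that act as the identity on two orthogonal faces F and $F^\perp$. If F is a face of $\mathrm{St}_1(A)$, let us define $\mathbf{G}_{F,F^\perp}$ as the group of all reversible transformations $\mathcal{U} \in \mathbf{G}_A$ such that 

$$\mathcal{U} =_{\omega_F} \mathcal{I}_A \qquad \text{and} \qquad \mathcal{U} =_{\omega_F^\perp} \mathcal{I}_A .$$

Then we have the following 

Theorem 15 For every face $F \subseteq \mathrm{St}_1(A)$ such that $F \neq \{0\}$ and $F \neq \mathrm{St}_1(A)$, the group $\mathbf{G}_{F,F^\perp}$ is topologically equivalent to a circle. 

Proof. Let $\mathcal{U}$ be a transformation in $\mathbf{G}_{F,F^\perp}$, $\Phi \in \mathrm{St}_1(A\tilde A)$ be a purification of the invariant state $\chi_A$ and $|\Phi_{\mathcal{U}}) := (\mathcal{U} \otimes \mathcal{I}_{\tilde A})\,|\Phi)$ be the Choi state of $\mathcal{U}$. Define the orthogonal faces $\tilde F := F_{\omega_F \otimes \chi_{\tilde A}}$ and $\tilde F^\perp = F_{\omega_F^\perp \otimes \chi_{\tilde A}}$, and the projections $\Pi_{\tilde F} := \Pi_F \otimes \mathcal{I}_{\tilde A}$ and $\Pi_{\tilde F}^\perp := \Pi_F^\perp \otimes \mathcal{I}_{\tilde A}$ (see lemma 50). Using lemma 57 we then obtain 

$$\Pi_{\tilde F}\,|\Phi_{\mathcal{U}}) = (\Pi_F\,\mathcal{U} \otimes \mathcal{I}_{\tilde A})\,|\Phi) = (\Pi_F \otimes \mathcal{I}_{\tilde A})\,|\Phi) = \frac{|F|}{d_A}\,|\Phi_F)$$

and, similarly, 

$$\Pi_{\tilde F}^\perp\,|\Phi_{\mathcal{U}}) = (\Pi_F^\perp\,\mathcal{U} \otimes \mathcal{I}_{\tilde A})\,|\Phi) = (\Pi_F^\perp \otimes \mathcal{I}_{\tilde A})\,|\Phi) = \frac{|F^\perp|}{d_A}\,|\Phi_{F^\perp}) .$$

This means that the projections of $\Phi_{\mathcal{U}}$ on the faces $\tilde F$ and $\tilde F^\perp$ are independent of $\mathcal{U}$. Also, it means that $\Phi_{\mathcal{U}}$ belongs to the face $F_\theta$ identified by the state $|\theta) := \frac{|F|}{d_A}\,|\Phi_F) + \frac{|F^\perp|}{d_A}\,|\Phi_{F^\perp})$ (lemma 58). Now, by the compression axiom, $F_\theta$ is isomorphic to the state space of a qubit, say with $\Phi_F$ and $\Phi_{F^\perp}$ indicating the north and south poles of the Bloch sphere, respectively, and we know that all the Choi states $\{\Phi_{\mathcal{U}}\}_{\mathcal{U} \in \mathbf{G}_{F,F^\perp}}$ are at the same latitude (precisely, the latitude is the angle $\xi$ given by $\cos\xi = (|F| - |F^\perp|)/d_A$). This implies that the states $\{\Phi_{\mathcal{U}}\}_{\mathcal{U} \in \mathbf{G}_{F,F^\perp}}$ are a subset of a circle $C_\theta$ in the Bloch sphere describing the face $F_\theta$. Precisely, the circle is given by 

$$C_\theta := \Bigl\{ \Psi \in F_\theta \;\Big|\; \Pi_{\{\Phi_F\}}\,|\Psi) = \frac{|F|}{d_A}\,|\Phi_F), \quad \Pi_{\{\Phi_{F^\perp}\}}\,|\Psi) = \frac{|F^\perp|}{d_A}\,|\Phi_{F^\perp}) \Bigr\} .$$

We now prove that in fact they are the whole circle. Let $\Psi$ be a state in $C_\theta$. Since $\Psi$ belongs to the face $F_\theta$, we obtain 

$$(\Pi_F \otimes \mathcal{I}_{\tilde A})\,|\Psi) = \Pi_{\tilde F}\,|\Psi) = \Pi_{\tilde F}\,\Pi_{F_\theta}\,|\Psi) = \Pi_{\{\Phi_F\}}\,|\Psi) = \frac{|F|}{d_A}\,|\Phi_F)$$

[the third equality comes from lemma 60 with the substitutions $F \to \tilde F$, $\varphi \to \Psi$, $\varphi_1 \to \Phi_F$, and $\varphi_2 \to \Phi_{F^\perp}$] and, similarly, 

$$(\Pi_F^\perp \otimes \mathcal{I}_{\tilde A})\,|\Psi) = \Pi_{\tilde F}^\perp\,|\Psi) = \Pi_{\tilde F}^\perp\,\Pi_{F_\theta}\,|\Psi) = \frac{|F^\perp|}{d_A}\,|\Phi_{F^\perp}) .$$

Therefore, we have 

$$(e_A|\,|\Psi) = [(a_F| + (a_F^\perp|]\,|\Psi) = [(e_A|\Pi_F \otimes \mathcal{I}_{\tilde A}]\,|\Psi) + [(e_A|\Pi_F^\perp \otimes \mathcal{I}_{\tilde A}]\,|\Psi)$$
$$= \frac{|F|}{d_A}\,(e_A|\,|\Phi_F) + \frac{|F^\perp|}{d_A}\,(e_A|\,|\Phi_{F^\perp}) = [(e_A|\Pi_F \otimes \mathcal{I}_{\tilde A}]\,|\Phi) + [(e_A|\Pi_F^\perp \otimes \mathcal{I}_{\tilde A}]\,|\Phi)$$
$$= [(a_F| + (a_F^\perp|]\,|\Phi) = |\chi_{\tilde A}) .$$

Since $\Psi$ and $\Phi$ are both purifications of the invariant state $\chi_{\tilde A}$, by the uniqueness of purification there must be a reversible transformation $\mathcal{U} \in \mathbf{G}_A$ such that $|\Psi) = (\mathcal{U} \otimes \mathcal{I}_{\tilde A})\,|\Phi)$. Finally, it is easy to check that $\Pi_F\,\mathcal{U} = \Pi_F$ and $\Pi_F^\perp\,\mathcal{U} = \Pi_F^\perp$, which, by lemma 57 implies $\mathcal{U} =_{\omega_F} \mathcal{I}_A$ and $\mathcal{U} =_{\omega_F^\perp} \mathcal{I}_A$. This proves that the Choi states $\{\Phi_{\mathcal{U}}\}_{\mathcal{U} \in \mathbf{G}_{F,F^\perp}}$ are the whole circle $C_\theta$. Since the Choi isomorphism is continuous in the operational norm (see theorem 14 of Ref. [ ]), the group $\mathbf{G}_{F,F^\perp}$ is topologically equivalent to a circle. ■ 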
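As a consistency check, the content of theorem 15 can be read off directly in standard quantum theory. The lines below are only an illustration under the Hilbert-space picture, which the proof above does not use: the face F is taken to be the set of states supported on the range of a projector $P_F$, with both $P_F$ and $P_{F^\perp} = I - P_F$ nonzero. The symbols $U_\theta$, $P_F$ and $\theta$ are introduced here for illustration only.

```latex
% Quantum-theory illustration (assumption: F = states supported on the range of P_F).
% A unitary channel rho -> U rho U^\dagger leaves every state of F and of F^\perp
% unchanged iff U acts as a phase on each of the two subspaces, i.e. up to a global phase
\[
  U_\theta \;=\; P_F + e^{i\theta}\, P_{F^\perp}, \qquad \theta \in [0, 2\pi).
\]
% These unitaries compose by adding the angles,
\[
  U_\theta\, U_{\theta'} \;=\; U_{\theta + \theta' \bmod 2\pi},
\]
% so the group they form is a copy of U(1), that is, a circle.
```

The hypotheses $F \neq \{0\}$ and $F \neq \mathrm{St}_1(A)$ correspond to both projectors being nonzero; if either vanishes, the relative phase has no effect and the group is trivial.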



XII. THE SUPERPOSITION PRINCIPLE 

The validity of the superposition principle, proved for 
two-dimensional systems using the geometry of the Bloch 
sphere (corollary [H]) , can be now extended to arbitrary 
systems thanks to lemma 

Theorem 16 (Superposition principle for general systems) Let $\{\varphi_i\}_{i=1}^{d_A} \subseteq \mathrm{St}_1(A)$ be a maximal set of perfectly distinguishable pure states and $\{a_i\}_{i=1}^{d_A}$ be the observation-test such that $(a_i|\varphi_j) = \delta_{ij}$. Then, for every choice of probabilities $\{p_i\}_{i=1}^{d_A}$, $p_i \ge 0$, $\sum_{i=1}^{d_A} p_i = 1$ there exists at least one pure state $\varphi_p \in \mathrm{St}_1(A)$ such that 

$$p_i = (a_i|\varphi_p) \qquad \forall i = 1, \dots, d_A \qquad (35)$$

or, equivalently, 

$$\Pi_{\{\varphi_i\}}\,|\varphi_p) = p_i\,|\varphi_i) \qquad \forall i = 1, \dots, d_A, \qquad (36)$$

where $\Pi_{\{\varphi_i\}}$ is the projection on $\varphi_i$. 

Proof. Let us first prove the equivalence between Eqs. (35) and (36). From Eq. (36) we obtain Eq. (35) using the relation $(e|\Pi_{\{\varphi_i\}} = (a_i|$, which follows from corollary 35. Conversely, from Eq. (35) we obtain Eq. (36) using corollary 38. Now, we will prove Eq. (35) by induction. The statement for N = 2 is proved by corollary [[]]. Assume that the statement holds for every system B of dimension $d_B = N$ and suppose that $d_A = N + 1$. Let F be the face identified by $\omega_F = \frac{1}{N}\sum_{i=1}^{N} \varphi_i$ and $F^\perp$ be the orthogonal face, identified by the state $\varphi_{N+1}$. Now there are two cases: either $p_{N+1} = 1$ or $p_{N+1} \neq 1$. If $p_{N+1} = 1$, then there is nothing to prove: the desired state is $\varphi_{N+1}$. Then, suppose that $p_{N+1} \neq 1$. Using the induction hypothesis and the compression axiom 3 we can find a state $\varphi_q \in F$ such that $(a_i|\varphi_q) = q_i$ with $q_i = p_i / (1 - p_{N+1})$, $i = 1, \dots, N$. Let us then define a new maximal set of perfectly distinguishable pure states $\{\varphi'_i\}_{i=1}^{N+1}$, with $\varphi'_1 = \varphi_q$ and $\varphi'_{N+1} = \varphi_{N+1}$. Note that one has $\omega_F = \frac{1}{N}\sum_{i=1}^{N} \varphi'_i$, that is, F is the face generated by the states $\{\varphi'_i\}_{i=1}^{N}$. Now consider the two-dimensional face $F'$ identified by $\theta = \frac{1}{2}(\varphi'_1 + \varphi'_{N+1})$. By corollary ^l| (superposition principle for qubits) we know that there exists a pure state $\varphi \in F'$ with $(a'_1|\varphi) = 1 - p_{N+1}$ and $(a'_{N+1}|\varphi) = p_{N+1}$. Let us define $V := \{1, \dots, N\}$ and $W := \{1, N + 1\}$. Then, we have $\Pi_F = \Pi_V$ and $\Pi_{F'} = \Pi_W$, and by lemma 56, 

$$\Pi_F|\varphi) = \Pi_F \Pi_{F'}|\varphi) = \Pi_{V \cap W}|\varphi) = \Pi_{\{\varphi'_1\}}|\varphi) = \Pi_{\{\varphi_q\}}|\varphi) = (1 - p_{N+1})\,|\varphi_q),$$

having used corollary 38 for the last equality. Finally, for $i = 1, \dots, N$ we have 

$$(a_i|\varphi) = (a_i|\Pi_F|\varphi) = (1 - p_{N+1})\,(a_i|\varphi_q) = (1 - p_{N+1})\,q_i = p_i .$$

On the other hand we have $(a_{N+1}|\varphi) = (a'_{N+1}|\varphi) = p_{N+1}$. ■ 
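The inductive step of the proof has a simple counterpart in standard quantum theory, sketched below under the assumption of the Hilbert-space picture with real amplitudes: a pure state with outcome probabilities $p_i$ is obtained by first building a state for the renormalized distribution $q_i = p_i/(1 - p_{N+1})$ and then superposing it with the last basis vector. The function names are hypothetical, and the recursion bottoms out at a single outcome rather than at the qubit case used in the text.

```python
import math

def inductive_superposition(probs):
    """Amplitudes of a unit vector whose squared moduli are probs, built as in the induction."""
    if len(probs) == 1:
        return [1.0]
    p_last = probs[-1]
    if p_last == 1:
        # Nothing to superpose: the desired state is the last basis vector.
        return [0.0] * (len(probs) - 1) + [1.0]
    # Renormalized distribution q_i = p_i / (1 - p_last) on the first N outcomes.
    q = [p / (1 - p_last) for p in probs[:-1]]
    sub = inductive_superposition(q)
    # Qubit-like superposition of the sub-state (weight 1 - p_last) and the last vector.
    return [math.sqrt(1 - p_last) * c for c in sub] + [math.sqrt(p_last)]

def outcome_probabilities(amplitudes):
    """Probabilities (a_i|psi) = |<i|psi>|^2 for the test in the same basis (Born rule)."""
    return [abs(c) ** 2 for c in amplitudes]

target = [0.5, 0.3, 0.2]
amplitudes = inductive_superposition(target)
recovered = outcome_probabilities(amplitudes)
```

Up to floating-point rounding, `recovered` reproduces `target`, which is Eq. (35) in this representation.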

A. Completeness for purification 

Using the superposition principle and the spectral decomposition of theorem ^ we can now show that every state of system A has a purification in AB provided $d_B \ge d_A$. 

Lemma 61 For every state $\rho \in \mathrm{St}_1(A)$ and for every system B with $d_B \ge d_A$ there exists a purification of $\rho$ in $\mathrm{St}_1(AB)$. 

Proof. Take the spectral decomposition of $\rho$, given by $\rho = \sum_{i=1}^{d_A} p_i \varphi_i$ where $\{p_i\}$ are probabilities and $\{\varphi_i\}_{i=1}^{d_A} \subseteq \mathrm{St}_1(A)$ is a maximal set of perfectly distinguishable pure states. Let $\{\psi_i\}_{i=1}^{d_B} \subseteq \mathrm{St}_1(B)$ be a maximal set of perfectly distinguishable pure states and $\{a_i\}_{i=1}^{d_A} \subseteq \mathrm{Eff}(A)$ (resp. $\{b_i\}_{i=1}^{d_B} \subseteq \mathrm{Eff}(B)$) be the test such that $(a_i|\varphi_j) = \delta_{ij}$ (resp. $(b_i|\psi_j) = \delta_{ij}$). Clearly, $\{\varphi_i \otimes \psi_j\}$ is a maximal set of perfectly distinguishable pure states for AB. Then, by the superposition principle (theorem 16) there exists a pure state $\Psi_\rho$ such that $(a_i \otimes b_j|\Psi_\rho) = p_i \delta_{ij}$. Equivalently, we have $(b_i|_B\,|\Psi_\rho)_{AB} = p_i\,|\varphi_i)_A$ for every $i = 1, \dots, d_A$ and $(b_i|_B\,|\Psi_\rho)_{AB} = 0$ for $i > d_A$. Summing over i we then obtain $(e|_B\,|\Psi_\rho)_{AB} = \sum_{i=1}^{d_B} (b_i|_B\,|\Psi_\rho)_{AB} = \sum_{i=1}^{d_A} p_i\,|\varphi_i)_A = |\rho)_A$. ■ 
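For comparison, the construction in the proof reduces to the familiar purification formula of standard quantum theory. The block below is an illustration under the Hilbert-space picture (an assumption, since the derivation above is purely operational), with $\{|\varphi_i\rangle\}$ the eigenvectors of $\rho$ and $\{|\psi_i\rangle\}$ an orthonormal set in B.

```latex
% Quantum-theory illustration (assumption: Hilbert-space representation, d_B >= d_A).
\[
  \rho = \sum_{i=1}^{d_A} p_i\, |\varphi_i\rangle\langle\varphi_i| ,
  \qquad
  |\Psi_\rho\rangle = \sum_{i=1}^{d_A} \sqrt{p_i}\; |\varphi_i\rangle \otimes |\psi_i\rangle .
\]
% Joint outcome probabilities, matching (a_i \otimes b_j | \Psi_\rho) = p_i \delta_{ij},
% and the marginal on A, matching (e|_B |\Psi_\rho)_{AB} = |\rho)_A:
\[
  \bigl|\langle \varphi_i \otimes \psi_j \,|\, \Psi_\rho \rangle\bigr|^2 = p_i\, \delta_{ij},
  \qquad
  \operatorname{Tr}_B\, |\Psi_\rho\rangle\langle\Psi_\rho| = \rho .
\]
```

The condition $d_B \ge d_A$ is what guarantees that B contains enough orthonormal vectors $|\psi_i\rangle$ to label all the terms.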

In the terminology of Ref. [21], lemma 61 states that a system B with $d_B \ge d_A$ is complete for the purification of system A. 

As a consequence of lemma 61 we have the following: 

Corollary 39 Every system B with $d_B = d_A$ is operationally equivalent to the conjugate system $\tilde A$. 

Proof. By corollary [HI the invariant state $\chi_A \in \mathrm{St}_1(A)$ has a purification $\Psi$ in $\mathrm{St}_1(AB)$. By corollary the marginal of $\Psi$ on B is the invariant state $\chi_B$. By definition, this means that B is a conjugate system of A. Since the conjugate system $\tilde A$ is unique up to operational equivalence (corollary 16), this implies the thesis. ■ 

B. Equivalence of systems with equal dimension 

We are now in position to prove that two systems A 
and B with the same dimension are operationally equiv- 
alent, namely that there is a reversible transformation 
from A to B. In other words, we prove that the infor- 
mational dimension classifies the systems of our theory 
up to operational equivalence. The fact that this prop- 
erty is derived from the principles, rather than being 
assumed from the start, is one of the important differ- 
ences of our work with respect to Refs. An- 
other difference is that here the equivalence of systems 
with the same dimension is proved after the derivation of 
the qubit, whereas in Refs. [16]-[18] the derivation of the 
qubit requires the equivalence of systems with the same 
dimension. 

Corollary 40 (Operational equivalence of systems with equal dimension) Every two systems A and B with $d_A = d_B$ are operationally equivalent. 

Proof. By corollary 39, A and B are both operationally equivalent to the conjugate system $\tilde A$. Hence, they are operationally equivalent to each other. ■ 

C. Reversible operations of perfectly distinguishable pure states 

An important consequence of the superposition principle is the possibility of transforming an arbitrary maximal set of perfectly distinguishable pure states into another via a reversible transformation: 

Corollary 41 Let A and B be two systems with $d_A = d_B =: d$ and let $\{\varphi_i\}_{i=1}^{d}$ (resp. $\{\psi_i\}_{i=1}^{d}$) be a maximal set of perfectly distinguishable pure states in A (resp. B). Then, there exists a reversible transformation $\mathcal{U} \in \mathrm{Transf}(A, B)$ such that $\mathcal{U}|\varphi_i) = |\psi_i)$. 



• the pairing between a state and an effect is given 
by the trace of the product of the corresponding 
matrices. 

Using the result of theorem ^, we will then obtain that all 
the physical transformations in our theory are exactly the 
physical transformations allowed in quantum mechanics. 
This will conclude our derivation of quantum theory. 



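Proof. Let $\Phi \in \mathrm{St}(A\tilde{A})$ be a purification of the invariant state $\chi_A$. Although we know that A and $\tilde{A}$ are operationally equivalent (corollary 39) we use the notation A and $\tilde{A}$ to distinguish between the two subsystems of $A\tilde{A}$. Define the pure state $\tilde{\varphi}_i$ via the relation $(a_i|_A |\Phi)_{A\tilde{A}} = \frac{1}{d} |\tilde{\varphi}_i)_{\tilde{A}}$, where $\{a_i\}_{i=1}^{d}$ is the observation-test such that $(a_i|\varphi_j) = \delta_{ij}$. Let $\{\tilde{a}_i\}_{i=1}^{d}$ be the observation-test such that $(\tilde{a}_i|\tilde{\varphi}_j) = \delta_{ij}$. Then, by lemma I we have

$$(\tilde{a}_i|_{\tilde{A}} |\Phi)_{A\tilde{A}} = \frac{1}{d} |\varphi_i)_A . \tag{37}$$

On the other hand, if $\{b_i\}_{i=1}^{d}$ is the observation-test such that $(b_i|\psi_j) = \delta_{ij}$, then using the superposition principle (theorem 16) we can construct a state $\Psi \in \mathrm{St}_1(B\tilde{A})$ such that $(b_i \otimes \tilde{a}_j|\Psi) = \delta_{ij}/d$, or, equivalently,

$$(\tilde{a}_i|_{\tilde{A}} |\Psi)_{B\tilde{A}} = \frac{1}{d} |\psi_i)_B . \tag{38}$$

Now, $\Phi$ and $\Psi$ have the same marginal on system $\tilde{A}$: they are both purifications of the invariant state $\chi_{\tilde{A}}$. Moreover, A and B are operationally equivalent because they have the same dimension (corollary 40). Hence, by the uniqueness of purification, there must be a reversible transformation $\mathcal{U} \in \mathrm{Transf}(A,B)$ such that

$$|\Psi)_{B\tilde{A}} = (\mathcal{U} \otimes \mathcal{I}_{\tilde{A}}) |\Phi)_{A\tilde{A}} . \tag{39}$$

Combining Eqs. (37), (38), and (39) we finally obtain

$$\frac{1}{d}\, \mathcal{U}|\varphi_i)_A = \left[\mathcal{U} \otimes (\tilde{a}_i|_{\tilde{A}}\right] |\Phi)_{A\tilde{A}} = (\tilde{a}_i|_{\tilde{A}} |\Psi)_{B\tilde{A}} = \frac{1}{d} |\psi_i)_B ,$$

that is, $\mathcal{U}|\varphi_i) = |\psi_i)$ for every $i = 1, \dots, d$. ■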
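In ordinary quantum theory, which is the model this derivation is heading toward, the transformation of corollary 41 can be written down explicitly. The following is a minimal sketch under the assumption that pure states are unit vectors in $\mathbb{C}^d$ and reversible transformations are unitary matrices; the helper names `basis_change`, `apply` and `is_unitary` are illustrative and do not come from the paper.

```python
import math

def basis_change(phis, psis):
    """Return U = sum_i |psi_i><phi_i| for two orthonormal bases of C^d."""
    d = len(phis)
    return [[sum(psis[i][r] * complex(phis[i][c]).conjugate() for i in range(d))
             for c in range(d)]
            for r in range(d)]

def apply(u, v):
    """Matrix-vector product u v."""
    return [sum(u[r][c] * v[c] for c in range(len(v))) for r in range(len(u))]

def is_unitary(u, tol=1e-12):
    """Check U^dagger U = I entry by entry."""
    d = len(u)
    for r in range(d):
        for c in range(d):
            entry = sum(complex(u[k][r]).conjugate() * u[k][c] for k in range(d))
            if abs(entry - (1 if r == c else 0)) > tol:
                return False
    return True

s = 1 / math.sqrt(2)
phis = [[1, 0], [0, 1]]      # first maximal set: the computational basis
psis = [[s, s], [s, -s]]     # second maximal set: the basis {|+>, |->}
U = basis_change(phis, psis)
```

In this example `U` is the Hadamard matrix: it sends each vector of the first set to the corresponding vector of the second set and is unitary, which is the quantum counterpart of the reversible transformation whose existence the corollary asserts.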



XIII. DERIVATION OF THE DENSITY 
MATRIX FORMALISM 



A. The basis 

In order to specify the correspondence between states and matrices we choose a particular basis for the vector space $\mathrm{St}_{\mathbb{R}}(A)$. For this purpose, we adopt the choice of basis used in Ref. [16]. The basis is constructed as follows: Let us first choose a maximal set of $d_A$ perfectly distinguishable states $\{\varphi_m\}_{m=1}^{d_A}$, and declare that they are the first $d_A$ basis vectors. Then, for every $m < n$ the face $F_{mn}$ generated by $\{\varphi_m, \varphi_n\}$ defines a "two-dimensional subsystem": precisely, the face $F_{mn} := F_{\omega_{mn}}$, with $\omega_{mn} := \frac{\varphi_m + \varphi_n}{2}$, can be ideally encoded in a two-dimensional system. Now, the convex set of states of a two-dimensional system is the Bloch sphere, and we can choose the z-axis to be the line joining the two states $\{\varphi_m, \varphi_n\}$, e.g. with the positive direction of the z-axis being the direction from $\varphi_m$ to $\varphi_n$. Once the direction of the z-axis has been specified, we can choose the x and y axes. Note that any pair of orthogonal directions in the plane orthogonal to the z-axis is a valid choice for the x- and y-axes (here we do not restrict ourselves to the choice of a right-handed coordinate system). At the moment there is no relation among the different choices of axes made for different values of m and n. However, to prove that the states are represented by positive matrices, later we will have to find a suitable way of connecting all these choices of axes.

Let $\varphi^{mn}_{x,+}, \varphi^{mn}_{x,-} \in F_{mn}$ ($\varphi^{mn}_{y,+}, \varphi^{mn}_{y,-} \in F_{mn}$) be the two perfectly distinguishable states in the direction of the x-axis (y-axis) and define

$$\sigma^{mn}_k := \varphi^{mn}_{k,+} - \varphi^{mn}_{k,-}, \qquad k = x,y. \tag{40}$$

An immediate observation is the following:

Lemma 62 The four vectors $\{\varphi_m, \varphi_n, \sigma^{mn}_x, \sigma^{mn}_y\} \subset \mathrm{St}_{\mathbb{R}}(A)$ are linearly independent. Moreover, denoting by $a_l \in \mathrm{Eff}(A)$ the atomic effect such that $(a_l|\varphi_l) = 1$ we have $(a_l|\sigma^{mn}_k) = 0$ for every $l, m, n \in \{1, \dots, d_A\}$ and for every $k = x,y$.



The goal of this section is to show that our set of axioms implies that

• the set of states for a system A of dimension $d_A$ is the set of density matrices on the Hilbert space $\mathbb{C}^{d_A}$

• the set of effects is the set of positive matrices bounded by the identity



Proof. Linear independence is evident from the geometry of the Bloch sphere. Moreover, for $l \notin \{m,n\}$ the states $\varphi^{mn}_{k,\pm}$ are perfectly distinguishable from $\varphi_l$, and, therefore, $(a_l|\sigma^{mn}_k) = 0$. If $l \in \{m,n\}$, since the states $\varphi^{mn}_{k,\pm}$, $k = x,y$ lie on the equator of the Bloch sphere, we know that $(a_l|\varphi^{mn}_{k,\pm}) = 1/2$ for $k = x,y$. Hence,

$$(a_l|\sigma^{mn}_k) = \frac{1}{2} - \frac{1}{2} = 0 .$$ ■
We now show that the collection of all vectors obtained in this way is a basis for $\mathrm{St}_{\mathbb{R}}(A)$. To this purpose we use the following

Lemma 63 Let $V \subset \{1, \dots, d_A\}$, and consider the projection $\Pi_V$. Then, for $m \in V$ and $n \notin V$, one has

$$\Pi_V |\sigma^{mn}_k) = 0 \qquad \text{for } k = x,y.$$

Proof. Using lemma |5^ and corollary |3^ we obtain

$$\Pi_V |\varphi^{mn}_{k,\pm}) = \Pi_V \Pi_{\{m,n\}} |\varphi^{mn}_{k,\pm}) .$$

Since the face $F_{mn}$ is isomorphic to the Bloch sphere and the states $\varphi^{mn}_{k,\pm}$, $k = x,y$ lie on the equator of the Bloch sphere, we know that $(a_m|\varphi^{mn}_{k,\pm}) = \frac{1}{2}$. This implies

$$\Pi_V |\sigma^{mn}_k) = \Pi_V \left( |\varphi^{mn}_{k,+}) - |\varphi^{mn}_{k,-}) \right) = \frac{1}{2} |\varphi_m) - \frac{1}{2} |\varphi_m) = 0 .$$ ■



Lemma 64 The vectors $\{\varphi_m\}_{m=1}^{d_A} \cup \{\sigma^{mn}_k\}_{n>m=1,\dots,d_A;\ k=x,y}$ form a basis for $\mathrm{St}_{\mathbb{R}}(A)$.

Proof. Since the number of vectors is exactly $d_A^2$, to prove that they form a basis it is enough to show that they are linearly independent. Suppose that there exists a vector of coefficients $\{c_m\} \cup \{c^{mn}_k\}$ such that

$$\sum_m c_m\, \varphi_m + \sum_{n>m,\ k=x,y} c^{mn}_k\, \sigma^{mn}_k = 0 .$$

Applying the projection $\Pi_{\{m,n\}}$ on both sides and using lemma 63 we obtain

$$c_m |\varphi_m) + c_n |\varphi_n) + c^{mn}_x |\sigma^{mn}_x) + c^{mn}_y |\sigma^{mn}_y) = 0 .$$

However, we know that the vectors $\{\varphi_m, \varphi_n, \sigma^{mn}_x, \sigma^{mn}_y\}$ are linearly independent. Consequently, $c_m = c_n = c^{mn}_k = 0$ for all $m, n, k$. ■



B. The matrices 

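Since the state space St(A) for system A spans a real vector space of dimension $D_A = d_A^2$, we can decide to represent the vectors $\{\varphi_m\}_{m=1}^{d_A} \cup \{\sigma^{mn}_k\}_{n>m=1,\dots,d_A;\ k=x,y}$ as Hermitian $d_A \times d_A$ matrices. Precisely, we associate the vector $\varphi_m$ to the matrix $S_{\varphi_m}$ defined by

$$[S_{\varphi_m}]_{rs} = \delta_{rm}\delta_{sm}, \tag{41}$$

the vector $\sigma^{mn}_x$ to the matrix

$$[S_{\sigma^{mn}_x}]_{rs} = \delta_{rm}\delta_{sn} + \delta_{rn}\delta_{sm} \tag{42}$$

and the vector $\sigma^{mn}_y$ to the matrix

$$[S_{\sigma^{mn}_y}]_{rs} = i\lambda\, (\delta_{rm}\delta_{sn} - \delta_{rn}\delta_{sm}) \tag{43}$$

where $\lambda$ can take the values +1 or −1. The freedom in the choice of $\lambda$ will be useful in subsection XIII C, where we will introduce the representation of composite systems of two qubits. However, this choice of sign plays no role in the present subsection, and for simplicity we will take the positive sign.

Recall that in principle any direction in the plane orthogonal to the z-axis can be chosen to be the x-axis. In general, the other possible choices for the x-axis will lead to matrices of the form

$$[S_{\sigma^{mn}_x}]_{rs} = \delta_{rm}\delta_{sn} e^{i\theta} + \delta_{rn}\delta_{sm} e^{-i\theta}, \qquad \theta \in [0, 2\pi) \tag{44}$$

and the corresponding choice for the y-axis will lead to matrices of the form

$$[S_{\sigma^{mn}_y}]_{rs} = i\lambda\, (\delta_{rm}\delta_{sn} e^{i\theta} - \delta_{rn}\delta_{sm} e^{-i\theta}), \qquad \theta \in [0, 2\pi). \tag{45}$$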
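The matrices of Eqs. (41)-(43) can be generated mechanically, which makes the counting $d_A + d_A(d_A - 1) = d_A^2$ and the linear independence easy to inspect. The sketch below is only an illustration: the function names are hypothetical, and independence is checked through pairwise orthogonality in the Hilbert-Schmidt inner product, a technique chosen here for convenience rather than the argument used in lemma 64.

```python
def basis_matrices(d, lam=1):
    """Return the d*d matrices of Eqs. (41)-(43) as nested lists of complex numbers."""
    def zeros():
        return [[0j] * d for _ in range(d)]
    mats = []
    for m in range(d):                       # Eq. (41): one diagonal matrix per phi_m
        s = zeros()
        s[m][m] = 1 + 0j
        mats.append(s)
    for m in range(d):
        for n in range(m + 1, d):
            sx = zeros()                     # Eq. (42): sigma_x^{mn}
            sx[m][n] = sx[n][m] = 1 + 0j
            sy = zeros()                     # Eq. (43): sigma_y^{mn}, lam = +1 or -1
            sy[m][n] = 1j * lam
            sy[n][m] = -1j * lam
            mats += [sx, sy]
    return mats

def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr[a^dagger b]."""
    d = len(a)
    return sum(a[i][j].conjugate() * b[i][j] for i in range(d) for j in range(d))

def is_hermitian(a):
    d = len(a)
    return all(a[i][j] == a[j][i].conjugate() for i in range(d) for j in range(d))

d = 3
mats = basis_matrices(d)
gram = [[hs_inner(a, b) for b in mats] for a in mats]
```

For `d = 3` the list `mats` has nine Hermitian elements and `gram` is diagonal with non-zero diagonal entries, so the matrices are linearly independent over the reals.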

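Since the vectors $\{\varphi_m\}_{m=1}^{d_A} \cup \{\sigma^{mn}_k\}_{n>m=1,\dots,d_A;\ k=x,y}$ are a basis for the real vector space $\mathrm{St}_{\mathbb{R}}(A)$, we can expand any state $\rho \in \mathrm{St}(A)$ on them:

$$|\rho) = \sum_m \rho_m |\varphi_m) + \sum_{n>m,\ k=x,y} \rho^{mn}_k |\sigma^{mn}_k) \tag{46}$$

and the expansion coefficients $\{\rho_m\}_{m=1}^{d_A} \cup \{\rho^{mn}_k\}_{n>m=1,\dots,d_A;\ k=x,y}$ are all real. Hence, each state $\rho$ is in one-to-one correspondence with a Hermitian matrix, given by

$$S_\rho = \sum_m \rho_m S_{\varphi_m} + \sum_{n>m,\ k=x,y} \rho^{mn}_k S_{\sigma^{mn}_k} . \tag{47}$$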



Since effects are linear functionals on states, they are also represented by Hermitian matrices. We will indicate with $E_a$ the Hermitian matrix associated to the effect $a \in \mathrm{Eff}(A)$. The matrix $E_a$ is uniquely defined by the relation:

$$(a|\rho) = \mathrm{Tr}[E_a S_\rho] .$$

In the rest of the section we show that the set of matrices $\{S_\rho \mid \rho \in \mathrm{St}_1(A)\}$ is the whole set of positive Hermitian matrices with unit trace and that the set of matrices $\{E_a \mid a \in \mathrm{Eff}(A)\}$ is the set of positive Hermitian matrices bounded by the identity.
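To make the expansion in Eq. (47) and the trace pairing concrete, the following sketch reads the real coefficients off a Hermitian matrix and reassembles it. It assumes the sign $\lambda = +1$ in Eq. (43), so that the entry above the diagonal equals $\rho^{mn}_x + i\rho^{mn}_y$; the function names are illustrative, and the test matrix is an ordinary quantum pure state used only as an example.

```python
def projector(v):
    """Rank-one matrix |v><v| for a vector v of complex amplitudes."""
    return [[complex(a) * complex(b).conjugate() for b in v] for a in v]

def expand(S):
    """Real coefficients of a Hermitian S in the basis of Eqs. (41)-(43), lambda = +1."""
    d = len(S)
    rho_diag = [S[m][m].real for m in range(d)]
    rho_off = {(m, n): (S[m][n].real, S[m][n].imag)   # (rho_x^{mn}, rho_y^{mn})
               for m in range(d) for n in range(m + 1, d)}
    return rho_diag, rho_off

def rebuild(rho_diag, rho_off):
    """Re-assemble S from its coefficients, following Eq. (47)."""
    d = len(rho_diag)
    S = [[0j] * d for _ in range(d)]
    for m in range(d):
        S[m][m] = complex(rho_diag[m])
    for (m, n), (cx, cy) in rho_off.items():
        S[m][n] = complex(cx, cy)     # rho_x + i rho_y above the diagonal
        S[n][m] = complex(cx, -cy)    # complex conjugate below the diagonal
    return S

def pairing(E, S):
    """The number Tr[E S] that represents (a|rho)."""
    d = len(S)
    return sum(E[i][j] * S[j][i] for i in range(d) for j in range(d))

S = projector([0.6, 0.8j])
rho_diag, rho_off = expand(S)
S_again = rebuild(rho_diag, rho_off)
E_a1 = [[1, 0], [0, 0]]               # the matrix S_{phi_1} of Eq. (41), used as an effect
prob = pairing(E_a1, S)
```

Here `prob` coincides with the first diagonal coefficient `rho_diag[0]`, which is the content of the statement $(a_m|\rho) = \rho_m$ used in the lemmas that follow.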

Let us start from some simple facts: 

Lemma 65 The invariant state $\chi_A$ has matrix representation $S_{\chi_A} = I_{d_A}/d_A$, where $I_{d_A}$ is the identity matrix in dimension $d_A$.

Proof. Obvious from the expression $\chi_A = \frac{1}{d_A} \sum_m \varphi_m$ and from the matrix representation of the states $\{\varphi_m\}_{m=1}^{d_A}$ in Eq. (41). ■

Lemma 66 Let $a_m \in \mathrm{Eff}(A)$ be the atomic effect such that $(a_m|\varphi_m) = 1$. Then, the effect $a_m$ has matrix representation $E_{a_m}$ such that $E_{a_m} = S_{\varphi_m}$.

Proof. Let $\rho \in \mathrm{St}_1(A)$ be an arbitrary state. Expanding $\rho$ as in Eq. (46) and using lemma 62 we obtain $(a_m|\rho) = \rho_m$. On the other hand, by Eq. (47) we have that $\rho_m$ is the m-th diagonal element of the matrix $S_\rho$: by definition of $S_{\varphi_m}$ [Eq. (41)], this implies $\rho_m = \mathrm{Tr}[S_{\varphi_m} S_\rho]$. Now, by construction we have $\mathrm{Tr}[E_{a_m} S_\rho] = (a_m|\rho) = \rho_m = \mathrm{Tr}[S_{\varphi_m} S_\rho]$ for every $\rho \in \mathrm{St}_1(A)$. Hence, $E_{a_m} = S_{\varphi_m}$. ■

Lemma 67 The deterministic effect $e \in \mathrm{Eff}(A)$ has matrix representation $E_e = I_{d_A}$.

Proof. Obvious from the expression $e = \sum_m a_m$, combined with lemma 66 and Eq. (41). ■

Corollary 42 For every state $\rho \in \mathrm{St}_1(A)$ one has $\mathrm{Tr}[S_\rho] = 1$.

Proof. $\mathrm{Tr}[S_\rho] = \mathrm{Tr}[E_e S_\rho] = (e|\rho) = 1$. ■

Theorem 17 The matrix elements of $S_\varphi$ for a pure state $\varphi \in \mathrm{St}_1(A)$ are $(S_\varphi)_{mn} = \sqrt{p_m p_n}\, e^{i\theta_{mn}}$, with $\sum_{m=1}^{d_A} p_m = 1$, $\theta_{mn} \in [0, 2\pi)$, $\theta_{mm} = 0$ and $\theta_{mn} = -\theta_{nm}$.

Proof. First of all, the diagonal elements of $S_\varphi$ are given by $[S_\varphi]_{mm} = (a_m|\varphi)$ [cf. Eqs. © and ©]. Denoting the m-th element by $p_m$, we clearly have $\sum_{m=1}^{d_A} p_m = (e|\varphi) = 1$. Now, the projection $\Pi_{\{m,n\}}|\varphi)$ is a state in the face $F_{mn}$, and, by our choice of representation, the corresponding matrix $S_{\Pi_{\{m,n\}}|\varphi)}$ is proportional to a pure qubit state (non-negative rank-one matrix). On the other hand, it is easy to see from Eqs. @) and (47) that $S_{\Pi_{\{m,n\}}|\varphi)}$ is the matrix with the same elements as $S_\varphi$ in the block corresponding to the qubit (m, n) and 0 elsewhere. In order to be positive and rank-one the corresponding 2 x 2 sub-matrix must have the off-diagonal elements $(S_\varphi)_{mn} = \sqrt{p_m p_n}\, e^{i\theta_{mn}}$, for some $\theta_{mn} \in [0, 2\pi)$ with $\theta_{nm} = -\theta_{mn}$. Repeating the same argument for all choices of indices m, n, the thesis follows. ■
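The form of the matrix elements in theorem 17 can be checked on a familiar example. The sketch below assumes the quantum model, where a pure state is a rank-one projector $|v\rangle\langle v|$; it extracts the probabilities $p_m$ and the phases $\theta_{mn}$ from such a matrix. The names `projector` and `polar_form` are illustrative.

```python
import cmath
import math

def projector(v):
    """Rank-one matrix |v><v| for a unit vector v."""
    return [[a * b.conjugate() for b in v] for a in v]

def polar_form(S):
    """Return the diagonal p_m and the phases theta_mn = arg(S_mn)."""
    d = len(S)
    p = [S[m][m].real for m in range(d)]
    theta = [[cmath.phase(S[m][n]) for n in range(d)] for m in range(d)]
    return p, theta

v = [0.5 + 0j, 0.5j, (1 + 1j) / 2]    # squared moduli 1/4, 1/4, 1/2
S = projector(v)
p, theta = polar_form(S)
```

Note that `cmath.phase` returns values in $(-\pi, \pi]$ rather than $[0, 2\pi)$; the two conventions differ only by multiples of $2\pi$, which do not affect $e^{i\theta_{mn}}$.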

Theorem 18 For a pure state $\varphi \in \mathrm{St}_1(A)$, the corresponding atomic effect $a_\varphi$ such that $(a_\varphi|\varphi) = 1$ has a matrix representation $E_{a_\varphi}$ with the property that $E_{a_\varphi} = S_\varphi$.

Proof. We already know that the statement holds for $d_A = 2$, where we proved the Bloch sphere representation, equivalent to the fact that states and effects are represented as 2 x 2 positive complex matrices, with the set of pure states identified with the set of all rank-one projectors. Let us now consider a generic system A, and write $a$ for $a_\varphi$. For every $m < n$, the face $F_{mn}$ generated by $\{\varphi_m, \varphi_n\}$ can be encoded in a two-dimensional system. Therefore, the matrices $S_{\Pi_{\{m,n\}}|\varphi)}$ and $E_{(a|\Pi_{\{m,n\}}}$ are positive (also, recall that all matrix elements outside the (m, n) block are zero). Let $\varphi_\perp^{(mn)}$ be the pure state in the face $F_{mn}$ that is perfectly distinguishable from $\Pi_{\{m,n\}}|\varphi)$. Note that, since $\varphi_\perp^{(mn)}$ belongs to the face $F_{mn}$, it is also perfectly distinguishable from $\Pi_{\{1,\dots,d_A\} \setminus \{m,n\}}|\varphi)$. Hence, $\varphi_\perp^{(mn)}$ is perfectly distinguishable from $\varphi$ and, in particular, $(a|\varphi_\perp^{(mn)}) = 0$ (lemma [39]). This implies the relation

$$\mathrm{Tr}\left[E_{(a|\Pi_{\{m,n\}}}\, S_{\varphi_\perp^{(mn)}}\right] = (a|\Pi_{\{m,n\}}|\varphi_\perp^{(mn)}) = (a|\varphi_\perp^{(mn)}) = 0 .$$

Now, since the matrix $E_{(a|\Pi_{\{m,n\}}}$ is positive, the above relation implies $E_{(a|\Pi_{\{m,n\}}} = c_{mn}\, S_{\Pi_{\{m,n\}}|\varphi)}$, where $c_{mn} \geq 0$. Finally, repeating the argument for all possible values of (m, n), we obtain that $c_{mn} = c$ for every m, n, that is, $E_a = c\, S_\varphi$. Taking the trace on both sides we obtain $\mathrm{Tr}[E_a] = c$. To prove that $c = 1$, we use the relation $\mathrm{Tr}[E_a]/d_A = (a|\chi_A) = 1/d_A$. ■

We conclude with a simple corollary that will be used in the next subsection:

Corollary 43 Let $\varphi \in \mathrm{St}_1(A)$ be a pure state and let $\{\gamma_i\}_i \subset \mathrm{St}_1(A)$ be a set of pure states. If the state $\varphi$ can be written as

$$|\varphi) = \sum_i x_i\, |\gamma_i)$$

for some real coefficients $\{x_i\}_i$, then the atomic effect $a$ such that $(a|\varphi) = 1$ is given by

$$(a| = \sum_i x_i\, (c_i| ,$$

where $c_i$ is the atomic effect such that $(c_i|\gamma_i) = 1$.

Proof. For every $\rho \in \mathrm{St}(A)$ by theorem 18 one has

$$(a|\rho) = \mathrm{Tr}[E_a S_\rho] = \mathrm{Tr}[S_\varphi S_\rho] = \sum_i x_i\, \mathrm{Tr}[S_{\gamma_i} S_\rho] = \sum_i x_i\, \mathrm{Tr}[E_{c_i} S_\rho] = \sum_i x_i\, (c_i|\rho) ,$$

thus implying the thesis. ■



C. Choice of axes for a two-qubit system 

If A and B are two systems with $d_A = d_B = 2$, then we can use two different types of matrix representations for the states of the composite system AB:

The first type of representation is the representation $S_\rho$ introduced through lemma 64: here we will refer to it as the standard representation. Note that there are many different representations of this type because for every pair (m, n) there is freedom in the choice of the x- and y-axis [cf. Eqs. (44) and (45)].

The second type of representation is the tensor product representation $T_\rho$, defined by the tensor product of matrices representing states of systems A and B: for a state $|\rho) = \sum_{i,j} \rho_{ij}\, |\alpha_i)|\beta_j)$, with $\alpha_i \in \mathrm{St}(A)$, $\beta_j \in \mathrm{St}(B)$, we have

$$T_\rho := \sum_{i,j} \rho_{ij}\, S^A_{\alpha_i} \otimes S^B_{\beta_j} , \tag{48}$$

where $S^A$ (resp. $S^B$) is the matrix representation for system A (resp. B). Here the freedom is in the choice of the axes for the Bloch spheres of qubits A and B. Since A and B are operationally equivalent, we will indicate the elements of the bases for $\mathrm{St}_{\mathbb{R}}(A)$ and $\mathrm{St}_{\mathbb{R}}(B)$ with the same letters: $\{\varphi_m\}_{m=1}^{2}$ for the two perfectly distinguishable pure states and $\{\sigma_k\}_{k=x,y}$ for the remaining basis vectors.

We now show a few properties of the tensor representation. Let $F_A$ denote the matrix corresponding to the effect $A \in \mathrm{Eff}(AB)$ in the tensor representation, that is, the matrix defined by

$$(A|\rho) := \mathrm{Tr}[F_A T_\rho] \qquad \forall \rho \in \mathrm{St}(AB). \tag{49}$$

It is easy to show that the matrix representation for effects must satisfy the analogue of Eq. (48):

Lemma 68 Let $A \in \mathrm{Eff}(AB)$ be a bipartite effect, written as $(A| = \sum_{i,j} A_{ij}\, (a_i|(b_j|$. Then one has

$$F_A = \sum_{i,j} A_{ij}\, E^A_{a_i} \otimes E^B_{b_j} ,$$

where $E^A_{a_i}$ (resp. $E^B_{b_j}$) is the matrix representing the single-qubit effect $a_i$ (resp. $b_j$) in the standard representation for qubit A (resp. B).

Proof. For every bipartite state $|\rho) = \sum_{k,l} \rho_{kl}\, |\alpha_k)|\beta_l)$ one has

$$\mathrm{Tr}[F_A T_\rho] = (A|\rho) = \sum_{i,j,k,l} A_{ij}\rho_{kl}\, (a_i|\alpha_k)(b_j|\beta_l) = \sum_{i,j,k,l} A_{ij}\rho_{kl}\, \mathrm{Tr}[E^A_{a_i} S^A_{\alpha_k}]\, \mathrm{Tr}[E^B_{b_j} S^B_{\beta_l}] = \sum_{i,j,k,l} A_{ij}\rho_{kl}\, \mathrm{Tr}[(E^A_{a_i} \otimes E^B_{b_j})\, T_{\alpha_k \otimes \beta_l}] ,$$

which implies the thesis. ■

Corollary 44 Let $\Psi \in \mathrm{St}_1(AB)$ be a pure state and let $A \in \mathrm{Eff}(AB)$ be the atomic effect such that $(A|\Psi) = 1$. Then one has $F_A = T_\Psi$.

Proof. Let $\{\alpha_i\}_{i=1}^{4}$ (resp. $\{\beta_j\}_{j=1}^{4}$) be a set of pure states that span $\mathrm{St}_{\mathbb{R}}(A)$ (resp. $\mathrm{St}_{\mathbb{R}}(B)$) and expand $\Psi$ as $|\Psi) = \sum_{i,j} c_{ij}\, |\alpha_i)|\beta_j)$. Then, corollary 43 yields $(A| = \sum_{i,j} c_{ij}\, (a_i|(b_j|$, where $a_i$ and $b_j$ are the atomic effects such that $(a_i|\alpha_i) = (b_j|\beta_j) = 1$. Therefore, we have

$$F_A = \sum_{i,j} c_{ij}\, E^A_{a_i} \otimes E^B_{b_j} = \sum_{i,j} c_{ij}\, S^A_{\alpha_i} \otimes S^B_{\beta_j} = T_\Psi .$$ ■



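Corollary 45 For every bipartite state $\rho \in \mathrm{St}_1(AB)$, $d_A = d_B = 2$, one has $\mathrm{Tr}[T_\rho] = 1$.

Proof. For each qubit we have

$$E_{a_1} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad E_{a_2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{50}$$

Hence, $E^A_{e_A} = E^B_{e_B} = I$, where I is the 2 x 2 identity matrix. By lemma 68, we then have $F_{e_A \otimes e_B} = I \otimes I$ and, therefore, $\mathrm{Tr}[T_\rho] = \mathrm{Tr}[F_{e_A \otimes e_B} T_\rho] = (e_A \otimes e_B|\rho) = 1$. ■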

Finally, an immediate consequence of local distin- 
guishability is the following: 

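Lemma 69 Suppose that $\mathcal{U} \in G_A$ and $\mathcal{V} \in G_B$ are two reversible transformations for qubits A and B, respectively, and that $U, V \in SU(2)$ are such that

$$S^A_{\mathcal{U}\rho} = U S^A_\rho U^\dagger \quad \forall \rho \in \mathrm{St}_1(A), \qquad S^B_{\mathcal{V}\sigma} = V S^B_\sigma V^\dagger \quad \forall \sigma \in \mathrm{St}_1(B).$$

Then, we have $T_{(\mathcal{U} \otimes \mathcal{V})\tau} = (U \otimes V)\, T_\tau\, (U^\dagger \otimes V^\dagger)$ for every $\tau \in \mathrm{St}_1(AB)$.

Proof. The thesis follows by linearity expanding $\tau$ as $\tau = \sum_{i,j=1}^{4} \tau_{ij}\, \alpha_i \otimes \beta_j$, where $\{\alpha_i\}_{i=1}^{4}$ and $\{\beta_j\}_{j=1}^{4}$ are bases for $\mathrm{St}_{\mathbb{R}}(A)$ and $\mathrm{St}_{\mathbb{R}}(B)$. ■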
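The linearity argument of lemma 69 rests on the identity $(U \otimes V)(S^A \otimes S^B)(U \otimes V)^\dagger = (U S^A U^\dagger) \otimes (V S^B V^\dagger)$ for product terms. The sketch below checks this identity numerically on one product term; the particular matrices and the helper names are arbitrary choices made for illustration.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product with rows and columns in lexicographic order."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def conj_by(u, s):
    """Return u s u^dagger."""
    return matmul(matmul(u, s), dagger(u))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

c, s = math.cos(0.3), math.sin(0.3)
U = [[c, -s], [s, c]]                                  # a rotation matrix, det = 1
V = [[cmath.exp(0.4j), 0], [0, cmath.exp(-0.4j)]]      # a diagonal element of SU(2)
S_a = [[0.75, 0.25], [0.25, 0.25]]                     # a Hermitian unit-trace matrix
S_b = [[0.5, -0.5j], [0.5j, 0.5]]                      # another one
lhs = conj_by(kron(U, V), kron(S_a, S_b))
rhs = kron(conj_by(U, S_a), conj_by(V, S_b))
```

Since both sides are linear in the product term, agreement on products extends to every real combination of them, which is how the lemma is proved.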

The rest of this subsection is aimed at showing that, with a suitable choice of matrix representation for system B, the standard representation coincides with the tensor representation, that is, $S_\rho = T_\rho$ for every $\rho \in \mathrm{St}(AB)$. This technical result is important because some properties used in our derivation are easily proved in the standard representation, while the property expressed by lemma 69 is easily proved in the tensor representation: it is then essential to show that we can construct a representation that enjoys both properties.

The four states $\{\varphi_m \otimes \varphi_n\}_{m,n=1}^{2}$ are clearly a maximal set of perfectly distinguishable pure states in AB. In the following we will construct the standard representation starting from this set.

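Lemma 70 For a composite system AB with $d_A = d_B = 2$ one can choose the standard representation in such a way that the following equalities hold

$$S_{\varphi_m \otimes \varphi_n} = T_{\varphi_m \otimes \varphi_n} \tag{51}$$

$$S_{\varphi_m \otimes \sigma_k} = T_{\varphi_m \otimes \sigma_k}, \qquad k = x,y, \tag{52}$$

$$S_{\sigma_k \otimes \varphi_m} = T_{\sigma_k \otimes \varphi_m}, \qquad k = x,y. \tag{53}$$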
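Proof. Let us choose single-qubit representations $S^A$ and $S^B$ that satisfy Eqs. (41), (42), and (43). On the other hand, choosing the states $\{\varphi_m \otimes \varphi_n\}$ in lexicographic order as the four distinguishable states for the standard representation, we have

$$[S_{\varphi_1 \otimes \varphi_1}]_{rs} = \delta_{1r}\delta_{1s}, \quad [S_{\varphi_1 \otimes \varphi_2}]_{rs} = \delta_{2r}\delta_{2s}, \quad [S_{\varphi_2 \otimes \varphi_1}]_{rs} = \delta_{3r}\delta_{3s}, \quad [S_{\varphi_2 \otimes \varphi_2}]_{rs} = \delta_{4r}\delta_{4s}.$$

With this choice, we get $S_{\varphi_m \otimes \varphi_n} = S^A_{\varphi_m} \otimes S^B_{\varphi_n} = T_{\varphi_m \otimes \varphi_n}$ for every m, n = 1, 2. This proves Eq. (51). Let us now prove Eqs. (52) and (53). Consider the two-dimensional face $F_{11,12}$, generated by the states $\varphi_1 \otimes \varphi_1$ and $\varphi_1 \otimes \varphi_2$. This face is the face identified by the state $\omega_{11,12} := \varphi_1 \otimes \chi_B$, and we have $F_{11,12} = \{\varphi_1\} \otimes \mathrm{St}_1(B)$. Therefore we can choose the vectors $\sigma^{11,12}_k$, $k = x,y$ to satisfy the relation $\sigma^{11,12}_k := \varphi_1 \otimes \sigma_k$, $k = x,y$. Now, in the standard representation we have

$$[S_{\sigma^{11,12}_x}]_{rs} = \delta_{r1}\delta_{s2} + \delta_{r2}\delta_{s1}, \qquad [S_{\sigma^{11,12}_y}]_{rs} = i\lambda\, (\delta_{r1}\delta_{s2} - \delta_{r2}\delta_{s1})$$

[cf. Eqs. (42) and (43)]. This implies $S_{\sigma^{11,12}_k} = T_{\varphi_1 \otimes \sigma_k}$, $k = x,y$. Repeating the same argument for the faces $F_{12,22}$, $F_{11,21}$, and $F_{21,22}$ we obtain the proof of Eqs. (52) and (53). ■

In order to prove that, with a suitable choice of axes, the standard representation coincides with the tensor representation, i.e. $S_\rho = T_\rho$ for every $\rho \in \mathrm{St}(AB)$, it remains to find a choice of axes such that $S_{\sigma_k \otimes \sigma_l} = T_{\sigma_k \otimes \sigma_l}$, $k, l = x,y$. This will be proved in the following.

Lemma 71 Let $\Phi \in \mathrm{St}_1(AB)$ be a pure state such that $(a_1 \otimes a_1|\Phi) = (a_2 \otimes a_2|\Phi) = 1/2$ [such a state exists due to the superposition principle]. With a suitable choice of the matrix representation $S^B$, the state $\Phi$ is represented by the matrix

$$T_\Phi = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}. \tag{54}$$

Moreover, one has

$$\Phi = \chi_A \otimes \chi_B + \frac{1}{4} \left( \sigma_x \otimes \sigma_x - \sigma_y \otimes \sigma_y + \sigma_z \otimes \sigma_z \right). \tag{55}$$

Proof. Let us start with the proof of Eq. (54). For every reversible transformation $\mathcal{U} \in G_A$, let $\mathcal{U}^* \in G_B$ be the conjugate of $\mathcal{U}$, defined with respect to the state $\Phi$. Since all 2 x 2 unitary (non-trivial) representations of SU(2) are unitarily equivalent, by a suitable choice of the standard representation $S^B$ for system B, one has

$$S^B_{\mathcal{U}^* \rho} = U^* S^B_\rho U^T , \tag{56}$$

where $U^*$ and $U^T$ are the complex conjugate and the transpose of the matrix $U \in SU(2)$ such that $S^A_{\mathcal{U}\rho} = U S^A_\rho U^\dagger$. Due to Eq. (56) and to lemma 69 the isotropic state $\Phi$ must satisfy the condition $(U \otimes U^*)\, T_\Phi\, (U^\dagger \otimes U^T) = T_\Phi$, $\forall U \in SU(2)$. Now, the unitary representation $\{U \otimes U^*\}$ has two irreducible subspaces and the projectors on them are given by the matrices

$$P_0 = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}, \qquad P_1 = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix} = I \otimes I - P_0 ,$$

where I is the 2 x 2 identity matrix. The most general form for $T_\Phi$ is then the following

$$T_\Phi = x_0 P_0 + x_1 P_1 = (x_0 - x_1) P_0 + x_1\, I \otimes I = \begin{pmatrix} \alpha + \beta & 0 & 0 & \beta \\ 0 & \alpha & 0 & 0 \\ 0 & 0 & \alpha & 0 \\ \beta & 0 & 0 & \alpha + \beta \end{pmatrix},$$

having defined $\alpha := x_1$ and $\beta := (x_0 - x_1)/2$.

Now, by construction the state $\Phi$ satisfies the condition

$$(a_m|_A |\Phi)_{AB} = \frac{1}{2} |\varphi_m)_B , \qquad m = 1, 2.$$

By definition of the tensor representation, the conditional states $(a_m|_A |\Phi)_{AB}$ are described by the diagonal blocks of the matrix $T_\Phi$:

$$S^B_{(a_1|_A |\Phi)} = \begin{pmatrix} \alpha + \beta & 0 \\ 0 & \alpha \end{pmatrix}, \qquad S^B_{(a_2|_A |\Phi)} = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha + \beta \end{pmatrix}. \tag{57}$$

Since the states $\varphi_1$ and $\varphi_2$ are pure, the above matrices must be rank-one. Moreover, their trace must be equal to $(a_m \otimes e_B|\Phi) = \frac{1}{2} (e_B|\varphi_m) = \frac{1}{2}$, m = 1, 2. Then we have two possibilities: either i) $\alpha = 0$ and $\beta = \frac{1}{2}$, or ii) $\alpha = -\beta = \frac{1}{2}$. In the case i), Eq. (54) holds. In the case ii), to prove Eq. (54) we need to change our choice of matrix representation for the qubit B. Precisely, we make the following change:

$$S^B_{\sigma_x} \to \tilde{S}^B_{\sigma_x} := -S^B_{\sigma_x}, \qquad S^B_{\sigma_y} \to \tilde{S}^B_{\sigma_y} := -S^B_{\sigma_y}, \qquad S^B_{\sigma_z} \to \tilde{S}^B_{\sigma_z} := -S^B_{\sigma_z}, \tag{58}$$

where $\sigma_z := \varphi_1 - \varphi_2$. Note that the inversion of the axes, sending $\sigma_k$ to $-\sigma_k$ for every k = x, y, z, is not an allowed physical transformation, but this is not a problem here, because Eq. (58) is just a new choice of matrix representation, in which the set of states of system B is still represented by the Bloch sphere.

More concisely, the change of matrix representation $S^B \to \tilde{S}^B$ can be expressed as

$$S^B_\rho \to \tilde{S}^B_\rho := Y\, [S^B_\rho]^T\, Y^\dagger , \qquad Y := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

Note that in the new representation $\tilde{S}^B$ the physical transformation $\mathcal{U}^*$ is still represented as $\tilde{S}^B_{\mathcal{U}^*\rho} = U^* \tilde{S}^B_\rho U^T$: indeed we have

$$\tilde{S}^B_{\mathcal{U}^*\rho} = Y [S^B_{\mathcal{U}^*\rho}]^T Y^\dagger = Y (U^* S^B_\rho U^T)^T Y^\dagger = Y (U [S^B_\rho]^T U^\dagger) Y^\dagger = (Y U Y^\dagger)(Y [S^B_\rho]^T Y^\dagger)(Y U^\dagger Y^\dagger) = U^* (Y [S^B_\rho]^T Y^\dagger) U^T = U^* \tilde{S}^B_\rho U^T ,$$

having used the relations $Y^\dagger Y = I$ and $Y U Y^\dagger = U^*$ for every $U \in SU(2)$. Clearly, the change of standard representation $S^B \to \tilde{S}^B$ for the qubit B induces a change of tensor representation $T \to \tilde{T}$, where $\tilde{T}$ is the tensor representation defined by $\tilde{T}_{\rho \otimes \sigma} := S^A_\rho \otimes \tilde{S}^B_\sigma$. With this change of representation, we have

$$\tilde{T}_\Phi = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}.$$

This concludes the proof of Eq. (54).

Let us now prove Eq. (55). Using the fact that by definition $T_{\rho \otimes \tau} = S^A_\rho \otimes S^B_\tau$ one can directly verify the relation

$$T_\Phi = S^A_\chi \otimes S^B_\chi + \frac{1}{4} \left( S^A_{\sigma_x} \otimes S^B_{\sigma_x} - S^A_{\sigma_y} \otimes S^B_{\sigma_y} + S^A_{\sigma_z} \otimes S^B_{\sigma_z} \right).$$

This is precisely the matrix version of Eq. (55). ■

Note that the choice of $S^B$ needed in Eq. (54) is compatible with the choice of $S^B$ needed in lemma 70: indeed, to prove compatibility we only have to show that the representation $S^B$ used in Eq. (54) has the property $[S^B_{\varphi_m}]_{rs} = \delta_{mr}\delta_{ms}$, m = 1, 2. This property is automatically guaranteed by the relation $(a_m|_A |\Phi)_{AB} = \frac{1}{2} |\varphi_m)$, m = 1, 2 and by Eq. (57) with $\alpha = 0$ and $\beta = 1/2$.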
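Three facts used in the proof of lemma 71 are easy to confirm numerically: the relation $Y U Y^\dagger = U^*$ for $U \in SU(2)$, the fact that $\rho \mapsto Y \rho^T Y^\dagger$ inverts all three Bloch axes (for a unit-trace 2 x 2 matrix this means $S \mapsto I - S$), and the invariance of the projector $P_0$ under $U \otimes U^*$. The sketch below uses $Y = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, which is one matrix with these properties and is taken here as an assumption; the sample matrices and helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def transpose(a):
    return [[a[j][i] for j in range(len(a))] for i in range(len(a[0]))]

def conj(a):
    return [[x.conjugate() for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

Y = [[0, -1], [1, 0]]
U = [[0.6, 0.8j], [0.8j, 0.6]]                  # unitary with determinant 0.36 + 0.64 = 1
S = [[0.7, 0.1 - 0.2j], [0.1 + 0.2j, 0.3]]      # Hermitian, unit trace
P0 = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]

yuy = matmul(matmul(Y, U), dagger(Y))           # expected to equal U*
flipped = matmul(matmul(Y, transpose(S)), dagger(Y))   # expected to equal I - S
W = kron(U, conj(U))
rotated = matmul(matmul(W, P0), dagger(W))      # expected to equal P0
identity_minus_S = [[(1 if i == j else 0) - S[i][j] for j in range(2)] for i in range(2)]
```

The second check shows why the change of representation in Eq. (58) is harmless: it is a relabeling of matrices that maps the Bloch sphere onto itself, even though it is not a physical transformation.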

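Corollary 46 In the standard representation the state $\Phi \in \mathrm{St}_1(AB)$ is represented by the matrix

$$S_\Phi = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & e^{i\theta} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ e^{-i\theta} & 0 & 0 & 1 \end{pmatrix}. \tag{59}$$

Proof. The thesis follows from theorem 17 and lemma 71. ■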

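We now define the reversible transformations $\mathcal{U}_{x,\pi}$ and $\mathcal{U}_{z,\frac{\pi}{2}}$ as follows

$$S_{\mathcal{U}_{x,\pi}\rho} = X S_\rho X, \qquad S_{\mathcal{U}_{z,\frac{\pi}{2}}\rho} = e^{-i\frac{\pi}{4} Z}\, S_\rho\, e^{i\frac{\pi}{4} Z}, \qquad X := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{60}$$

Also, we define the states $\Psi$, $\Phi_{z,\frac{\pi}{2}}$, and $\Psi_{z,\frac{\pi}{2}}$ as

$$|\Psi) := (\mathcal{U}_{x,\pi} \otimes \mathcal{I}_B) |\Phi), \qquad |\Phi_{z,\frac{\pi}{2}}) := (\mathcal{U}_{z,\frac{\pi}{2}} \otimes \mathcal{I}_B) |\Phi), \qquad |\Psi_{z,\frac{\pi}{2}}) := (\mathcal{U}_{z,\frac{\pi}{2}} \otimes \mathcal{I}_B) |\Psi).$$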

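Lemma 72 The states $\Psi$, $\Phi_{z,\frac{\pi}{2}}$, and $\Psi_{z,\frac{\pi}{2}}$ have the following tensor representation

$$T_\Psi = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad T_{\Phi_{z,\frac{\pi}{2}}} = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & -i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 1 \end{pmatrix}, \quad T_{\Psi_{z,\frac{\pi}{2}}} = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{61}$$

Moreover, one has

$$\Psi = \chi_A \otimes \chi_B + \frac{1}{4} (\sigma_x \otimes \sigma_x + \sigma_y \otimes \sigma_y - \sigma_z \otimes \sigma_z)$$
$$\Phi_{z,\frac{\pi}{2}} = \chi_A \otimes \chi_B + \frac{1}{4} (\sigma_y \otimes \sigma_x + \sigma_x \otimes \sigma_y + \sigma_z \otimes \sigma_z)$$
$$\Psi_{z,\frac{\pi}{2}} = \chi_A \otimes \chi_B + \frac{1}{4} (\sigma_y \otimes \sigma_x - \sigma_x \otimes \sigma_y - \sigma_z \otimes \sigma_z). \tag{62}$$

Proof. Eq. (61) is obtained from Eq. (54) by explicit calculation using lemma 69 and Eq. (60). Then, the validity of Eq. (62) is easily obtained from Eq. (55) using the relations

$$\mathcal{U}_{x,\pi} |\sigma_x) = |\sigma_x), \qquad \mathcal{U}_{x,\pi} |\sigma_y) = -|\sigma_y), \qquad \mathcal{U}_{x,\pi} |\sigma_z) = -|\sigma_z)$$

and

$$\mathcal{U}_{z,\frac{\pi}{2}} |\sigma_x) = |\sigma_y), \qquad \mathcal{U}_{z,\frac{\pi}{2}} |\sigma_y) = -|\sigma_x), \qquad \mathcal{U}_{z,\frac{\pi}{2}} |\sigma_z) = |\sigma_z) .$$ ■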
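The "explicit calculation" behind Eq. (61) amounts to conjugating the matrix of Eq. (54) with $X \otimes I$ and with $e^{-i\frac{\pi}{4}Z} \otimes I$, following lemma 69 and Eq. (60). The sketch below carries it out numerically, under the assumption that both transformations act on the first qubit; helper names are illustrative.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
R = [[cmath.exp(-1j * math.pi / 4), 0],         # e^{-i pi Z / 4} = diag(e^{-i pi/4}, e^{i pi/4})
     [0, cmath.exp(1j * math.pi / 4)]]
T_phi = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]   # Eq. (54)

def act_on_first(u, t):
    """Conjugate the two-qubit matrix t with u on the first qubit and identity on the second."""
    g = kron(u, I2)
    return matmul(matmul(g, t), dagger(g))

T_psi = act_on_first(X, T_phi)
T_phi_z = act_on_first(R, T_phi)
T_psi_z = act_on_first(R, T_psi)
```

The three results reproduce the matrices listed in Eq. (61), including the placement of $-i$ above the diagonal and $+i$ below it.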



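Lemma 73 The states $\Psi$, $\Phi_{z,\frac{\pi}{2}}$, and $\Psi_{z,\frac{\pi}{2}}$ have a standard representation of the form

$$S_\Psi = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & e^{i\gamma} & 0 \\ 0 & e^{-i\gamma} & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad S_{\Phi_{z,\frac{\pi}{2}}} = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & -\lambda i e^{i\theta} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \lambda i e^{-i\theta} & 0 & 0 & 1 \end{pmatrix}, \quad S_{\Psi_{z,\frac{\pi}{2}}} = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & -\mu i e^{i\gamma} & 0 \\ 0 & \mu i e^{-i\gamma} & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{63}$$

with $\theta$ as in corollary 46, $\gamma \in [0, 2\pi)$ and $\lambda, \mu \in \{-1, 1\}$.

Proof. Let us start from $\Psi$. First, from Eq. (62) it is immediate to obtain $(a_1 \otimes a_1|\Psi) = (a_2 \otimes a_2|\Psi) = 0$ and $(a_1 \otimes a_2|\Psi) = (a_2 \otimes a_1|\Psi) = 1/2$. This gives the diagonal elements of $S_\Psi$. Then, using theorem 17 we obtain that $S_\Psi$ must be as in Eq. (63), for some value of $\gamma$. Let us now consider $\Phi_{z,\frac{\pi}{2}}$. Again, the diagonal elements of the matrix $S_{\Phi_{z,\frac{\pi}{2}}}$ are obtained from Eq. (62), which in this case yields $(a_1 \otimes a_1|\Phi_{z,\frac{\pi}{2}}) = (a_2 \otimes a_2|\Phi_{z,\frac{\pi}{2}}) = 1/2$ and $(a_1 \otimes a_2|\Phi_{z,\frac{\pi}{2}}) = (a_2 \otimes a_1|\Phi_{z,\frac{\pi}{2}}) = 0$. Hence, by theorem 17 we must have

$$S_{\Phi_{z,\frac{\pi}{2}}} = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & e^{i\delta} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ e^{-i\delta} & 0 & 0 & 1 \end{pmatrix}$$

for some value of $\delta \in [0, 2\pi)$. Now, denote by $A$ the effect such that $(A|\Phi) = 1$. We then have

$$(A|\Phi_{z,\frac{\pi}{2}}) = \mathrm{Tr}[F_A T_{\Phi_{z,\frac{\pi}{2}}}] = \mathrm{Tr}[T_\Phi T_{\Phi_{z,\frac{\pi}{2}}}] = \frac{1}{2} ,$$

having used theorem 18, corollary 44, and Eq. (61). Hence, we have $\mathrm{Tr}[S_\Phi S_{\Phi_{z,\frac{\pi}{2}}}] = 1/2$, which implies $\delta = \theta \pm \frac{\pi}{2}$, as in Eq. (63). Finally, the same arguments can be used for $\Psi_{z,\frac{\pi}{2}}$: the diagonal elements of $S_{\Psi_{z,\frac{\pi}{2}}}$ are obtained from the relations $(a_1 \otimes a_1|\Psi_{z,\frac{\pi}{2}}) = (a_2 \otimes a_2|\Psi_{z,\frac{\pi}{2}}) = 0$ and $(a_1 \otimes a_2|\Psi_{z,\frac{\pi}{2}}) = (a_2 \otimes a_1|\Psi_{z,\frac{\pi}{2}}) = 1/2$, which follow from Eq. (62). This implies that the matrix $S_{\Psi_{z,\frac{\pi}{2}}}$ has the form

$$S_{\Psi_{z,\frac{\pi}{2}}} = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & e^{i\nu} & 0 \\ 0 & e^{-i\nu} & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

for some $\nu \in [0, 2\pi)$. The relation $\mathrm{Tr}[S_\Psi S_{\Psi_{z,\frac{\pi}{2}}}] = \mathrm{Tr}[T_\Psi T_{\Psi_{z,\frac{\pi}{2}}}] = 1/2$ then implies $\nu = \gamma \pm \frac{\pi}{2}$. ■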

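Let us now consider the four vectors $\Sigma_x^{(11,22)}$, $\Sigma_y^{(11,22)}$, $\Sigma_x^{(12,21)}$, $\Sigma_y^{(12,21)}$, defined as follows

$$\Sigma_x^{(11,22)} := 2 \left( \Phi - \chi_A \otimes \chi_B - \frac{1}{4} \sigma_z \otimes \sigma_z \right)$$
$$\Sigma_y^{(11,22)} := 2 \left( \Phi_{z,\frac{\pi}{2}} - \chi_A \otimes \chi_B - \frac{1}{4} \sigma_z \otimes \sigma_z \right)$$
$$\Sigma_x^{(12,21)} := 2 \left( \Psi - \chi_A \otimes \chi_B + \frac{1}{4} \sigma_z \otimes \sigma_z \right)$$
$$\Sigma_y^{(12,21)} := 2 \left( \Psi_{z,\frac{\pi}{2}} - \chi_A \otimes \chi_B + \frac{1}{4} \sigma_z \otimes \sigma_z \right). \tag{64}$$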



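By the previous results, it is immediate to obtain the matrix representations of these vectors. In the tensor representation, using Eqs. (54) and (61) we obtain

$$T_{\Sigma_x^{(11,22)}} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \qquad T_{\Sigma_y^{(11,22)}} = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix},$$

$$T_{\Sigma_x^{(12,21)}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad T_{\Sigma_y^{(12,21)}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},$$

while in the standard representation, using Eqs. (59) and (63) we obtain

$$S_{\Sigma_x^{(11,22)}} = \begin{pmatrix} 0 & 0 & 0 & e^{i\theta} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ e^{-i\theta} & 0 & 0 & 0 \end{pmatrix}, \qquad S_{\Sigma_y^{(11,22)}} = \begin{pmatrix} 0 & 0 & 0 & -\lambda i e^{i\theta} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \lambda i e^{-i\theta} & 0 & 0 & 0 \end{pmatrix},$$

$$S_{\Sigma_x^{(12,21)}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & e^{i\gamma} & 0 \\ 0 & e^{-i\gamma} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad S_{\Sigma_y^{(12,21)}} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -\mu i e^{i\gamma} & 0 \\ 0 & \mu i e^{-i\gamma} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$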

Comparing the two matrix representations we are now in position to prove the desired result:

Lemma 74 With a suitable choice of axes, one has $S_{\sigma_k \otimes \sigma_l} = T_{\sigma_k \otimes \sigma_l}$ for every $k, l = x, y$.

Proof. For the face (11, 22), using the freedom coming from Eqs. (43) and (44), we redefine the x and y axes so that $\sigma_x^{(11,22)} := \Sigma_x^{(11,22)}$ and $\lambda \sigma_y^{(11,22)} := \Sigma_y^{(11,22)}$. In this way we have

$$S_{\Sigma_k^{(11,22)}} = T_{\Sigma_k^{(11,22)}} \qquad \forall k = x, y.$$

Likewise, for the face (12, 21) we redefine the x and y axes so that $\sigma_x^{(12,21)} := \Sigma_x^{(12,21)}$ and $\mu \sigma_y^{(12,21)} := \Sigma_y^{(12,21)}$, so that we have

$$S_{\Sigma_k^{(12,21)}} = T_{\Sigma_k^{(12,21)}} \qquad \forall k = x, y.$$

Finally, using Eqs. (55), (62), and (64) we have the relations

$$\sigma_x \otimes \sigma_x = \Sigma_x^{(11,22)} + \Sigma_x^{(12,21)}, \qquad \sigma_y \otimes \sigma_y = \Sigma_x^{(12,21)} - \Sigma_x^{(11,22)},$$
$$\sigma_x \otimes \sigma_y = \Sigma_y^{(11,22)} - \Sigma_y^{(12,21)}, \qquad \sigma_y \otimes \sigma_x = \Sigma_y^{(11,22)} + \Sigma_y^{(12,21)} .$$

Since S and T coincide on the right-hand side of each equality, they must also coincide on the left-hand side. ■

Theorem 19 With a suitable choice of axes, the standard representation coincides with the tensor representation, that is, $S_\rho = T_\rho$ for every $\rho \in \mathrm{St}(AB)$.

Proof. Combining lemma 70 with lemma 74 we obtain that S and T coincide on the tensor product basis $\mathsf{B} \times \mathsf{B}$, where $\mathsf{B} = \{\varphi_1, \varphi_2, \sigma_x, \sigma_y\}$. By linearity, S and T coincide on every state. ■

From now on, whenever we will consider a composite system AB where A and B are two-dimensional we will adopt the choice that guarantees that the standard representation coincides with the tensor representation.

D. Positivity of the matrices

In this paragraph we show that the states in our theory can be represented by positive matrices. This amounts to proving that for every system A, the set of states $\mathrm{St}_1(A)$ can be represented as a subset of the set of density matrices in dimension $d_A$. This result will be completed in subsection XIII E, where we will see that, in fact, every density matrix in dimension $d_A$ corresponds to some state of $\mathrm{St}_1(A)$.

The starting point to prove positivity is the following:

Lemma 75 Let A and B be two-dimensional systems. Then, for every pure state $\Psi \in \mathrm{St}(AB)$ one has $S_\Psi \geq 0$.

Proof. Take an arbitrary vector $V \in \mathbb{C}^2 \otimes \mathbb{C}^2$, written in the Schmidt form as $|V\rangle = \sum_{n=1}^{2} \sqrt{\lambda_n}\, |v_n\rangle |w_n\rangle$. Introducing the unitaries U, V such that $U|v_n\rangle = |n\rangle$ and $V|w_n\rangle = |n\rangle$ for every n = 1, 2, we have

$$|V\rangle = (U^\dagger \otimes V^\dagger) |W\rangle , \qquad \text{where } |W\rangle = \sum_{n=1}^{2} \sqrt{\lambda_n}\, |n\rangle |n\rangle .$$

Therefore, we have

$$\langle V| S_\Psi |V\rangle = \langle W| S_{(\mathcal{U} \otimes \mathcal{V})\Psi} |W\rangle ,$$

where $\mathcal{U}$ and $\mathcal{V}$ are the reversible transformations defined by $S_{\mathcal{U}\rho} = U S_\rho U^\dagger$ and $S_{\mathcal{V}\rho} = V S_\rho V^\dagger$, respectively ($\mathcal{U}$ and $\mathcal{V}$ are physical transformations by virtue of corollary 30). Here we used the fact that the standard two-qubit representation coincides with the tensor representation and, therefore, $S_{(\mathcal{U} \otimes \mathcal{V})\Psi} = (U \otimes V)\, S_\Psi\, (U \otimes V)^\dagger$. Denoting the pure state $(\mathcal{U} \otimes \mathcal{V})|\Psi)$ by $|\Psi')$ we then have

$$\langle V| S_\Psi |V\rangle = \lambda_1 [S_{\Psi'}]_{11,11} + \lambda_2 [S_{\Psi'}]_{22,22} + 2\sqrt{\lambda_1 \lambda_2}\, \mathrm{Re}\left([S_{\Psi'}]_{11,22}\right).$$

Since by theorem 17 we have $[S_{\Psi'}]_{11,22} = \sqrt{[S_{\Psi'}]_{11,11} [S_{\Psi'}]_{22,22}}\; e^{i\theta}$, we conclude

$$\langle V| S_\Psi |V\rangle = \lambda_1 [S_{\Psi'}]_{11,11} + \lambda_2 [S_{\Psi'}]_{22,22} + 2\cos\theta \sqrt{\lambda_1 \lambda_2 [S_{\Psi'}]_{11,11} [S_{\Psi'}]_{22,22}} \geq \left( \sqrt{\lambda_1 [S_{\Psi'}]_{11,11}} - \sqrt{\lambda_2 [S_{\Psi'}]_{22,22}} \right)^2 \geq 0 .$$

Since the vector $V \in \mathbb{C}^2 \otimes \mathbb{C}^2$ is arbitrary, the matrix $S_\Psi$ is positive. ■
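The last step in the proof of lemma 75 is the elementary bound $\lambda_1 a + \lambda_2 b + 2\cos\theta \sqrt{\lambda_1 \lambda_2 a b} \geq (\sqrt{\lambda_1 a} - \sqrt{\lambda_2 b})^2$, with $a$ and $b$ standing for the two diagonal entries $[S_{\Psi'}]_{11,11}$ and $[S_{\Psi'}]_{22,22}$. The difference of the two sides equals $2(1 + \cos\theta)\sqrt{\lambda_1 \lambda_2 a b}$, which is non-negative. The following sketch scans a grid of values as a sanity check; it is an illustration, not a proof, and the function names are hypothetical.

```python
import math

def quadratic_form(l1, l2, a, b, theta):
    """lambda_1 a + lambda_2 b + 2 cos(theta) sqrt(lambda_1 lambda_2 a b)."""
    return l1 * a + l2 * b + 2 * math.cos(theta) * math.sqrt(l1 * l2 * a * b)

def lower_bound(l1, l2, a, b):
    """(sqrt(lambda_1 a) - sqrt(lambda_2 b))^2, which is manifestly non-negative."""
    return (math.sqrt(l1 * a) - math.sqrt(l2 * b)) ** 2

def min_gap(steps=10):
    """Smallest value of quadratic_form - lower_bound on a grid of parameters."""
    gaps = []
    for i in range(steps + 1):
        l1 = i / steps                      # Schmidt weights with l1 + l2 = 1
        for j in range(steps + 1):
            a = j / steps
            for k in range(steps + 1):
                b = k / steps
                for t in range(steps):
                    theta = 2 * math.pi * t / steps
                    gaps.append(quadratic_form(l1, 1 - l1, a, b, theta)
                                - lower_bound(l1, 1 - l1, a, b))
    return min(gaps)

smallest = min_gap()
```

The minimum is attained at $\theta = \pi$, where the two sides coincide up to rounding.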



Corollary 47 Let C be a system of dimension $d_C = 4$. Then, with a suitable choice of matrix representation the pure states of C are represented by positive matrices.

Proof. The system C is operationally equivalent to the composite system AB, where $d_A = d_B = 2$. Let $\mathcal{U} \in \mathrm{Transf}(AB, C)$ be the reversible transformation implementing the equivalence. Now, we know that the states of AB are represented by positive matrices. If we define the basis vectors for C by applying $\mathcal{U}$ to the basis for AB, then we obtain that the states of C are represented by the same matrices representing the states of AB. ■

Corollary 48 Let A be a system with $d_A = 3$. With a suitable choice of matrix representation, the matrix $S_\varphi$ is positive for every pure state $\varphi \in \mathrm{St}(A)$.

Proof. Let C be a system with $d_C = 4$. By corollary 47 the states of C are represented by positive matrices. Define the state $\omega := \frac{1}{3}(\varphi_1 + \varphi_2 + \varphi_3)$, where $\{\varphi_m\}_{m=1}^{4}$ are four perfectly distinguishable pure states. By the compression axiom, the face $F_\omega$ can be encoded in a three-dimensional system D (corollary ^). In fact, since D is operationally equivalent to A, the face $F_\omega$ can be encoded in A. Let $\mathcal{E} \in \mathrm{Transf}(D, A)$ and $\mathcal{D} \in \mathrm{Transf}(A, D)$ be the encoding and decoding operation, respectively. If we define the basis vectors for A by applying $\mathcal{E}$ to the basis vectors for the face $F_\omega$, then we obtain that the states of A are represented by the same matrices representing the states in the face $F_\omega$. Since these matrices are positive, the thesis follows. ■

From now on, for every three-dimensional system A we will choose the x and y axes so that $S_\rho$ is positive for every $\rho \in \mathrm{St}(A)$.

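Corollary 49 Let $\varphi \in \mathrm{St}_1(A)$ be a pure state with $d_A = 3$. Then, the corresponding matrix $S_\varphi$, given by

$$S_\varphi = \begin{pmatrix} p_1 & \sqrt{p_1 p_2}\, e^{i\theta_{12}} & \sqrt{p_1 p_3}\, e^{i\theta_{13}} \\ \sqrt{p_1 p_2}\, e^{-i\theta_{12}} & p_2 & \sqrt{p_2 p_3}\, e^{i\theta_{23}} \\ \sqrt{p_1 p_3}\, e^{-i\theta_{13}} & \sqrt{p_2 p_3}\, e^{-i\theta_{23}} & p_3 \end{pmatrix} \tag{65}$$

satisfies the property

$$e^{i\theta_{13}} = e^{i(\theta_{12} + \theta_{23})} .$$

Equivalently, $S_\varphi = |v\rangle\langle v|$, where $v \in \mathbb{C}^3$ is the vector given by $|v\rangle := (\sqrt{p_1}, \sqrt{p_2}\, e^{-i\theta_{12}}, \sqrt{p_3}\, e^{-i\theta_{13}})^T$.

Proof. The relation can be trivially satisfied when $p_i = 0$ for some $i \in \{1, 2, 3\}$. Hence, let us assume $p_1, p_2, p_3 > 0$. Computing the determinant of $S_\varphi$ one obtains $\det(S_\varphi) = 2 p_1 p_2 p_3 [\cos(\theta_{12} + \theta_{23} - \theta_{13}) - 1]$. Since $S_\varphi$ is positive, we must have $\det(S_\varphi) \geq 0$. If $p_1, p_2, p_3 > 0$ the only possibility is $\theta_{13} = \theta_{12} + \theta_{23} \mod 2\pi$. ■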
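The determinant formula used in the proof of corollary 49 can be confirmed numerically. The sketch below builds the matrix of Eq. (65) for given probabilities and phases, evaluates its determinant by cofactor expansion, and compares it with $2 p_1 p_2 p_3 [\cos(\theta_{12} + \theta_{23} - \theta_{13}) - 1]$; the function names and sample values are illustrative.

```python
import cmath
import math

def pure_state_matrix(p, t12, t23, t13):
    """Hermitian 3x3 matrix of Eq. (65) with probabilities p and phases theta_mn."""
    r = [math.sqrt(x) for x in p]
    e = cmath.exp
    return [[p[0],                       r[0] * r[1] * e(1j * t12),  r[0] * r[2] * e(1j * t13)],
            [r[0] * r[1] * e(-1j * t12), p[1],                       r[1] * r[2] * e(1j * t23)],
            [r[0] * r[2] * e(-1j * t13), r[1] * r[2] * e(-1j * t23), p[2]]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def det_formula(p, t12, t23, t13):
    """Closed form quoted in the proof of corollary 49."""
    return 2 * p[0] * p[1] * p[2] * (math.cos(t12 + t23 - t13) - 1)

p = [0.2, 0.3, 0.5]
generic = det3(pure_state_matrix(p, 0.4, 1.1, 0.9))         # phases violate the constraint
constrained = det3(pure_state_matrix(p, 0.4, 1.1, 1.5))     # theta_13 = theta_12 + theta_23
```

For generic phases the determinant is strictly negative, so the matrix cannot be positive; it vanishes exactly when the phase constraint holds, in which case the matrix is the rank-one projector $|v\rangle\langle v|$.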



Corollary 49 can be easily extended to systems of arbitrary dimension. To this purpose, we choose the x- and y-axes in such a way that the projection of every state $\rho \in \mathrm{St}_1(A)$ on a three-dimensional face is represented by a positive matrix.



Lemma 76 If $\varphi \in \mathrm{St}_1(A)$ is a pure state and $d_A = N$, then $S_\varphi = |v\rangle\langle v|$, where $v \in \mathbb{C}^N$ is the vector given by $v := (\sqrt{p_1}, \sqrt{p_2}\, e^{-i\alpha_2}, \dots, \sqrt{p_N}\, e^{-i\alpha_N})^T$ with $\alpha_i \in [0, 2\pi)$ $\forall i = 2, \dots, N$.

Proof. Consider a triple $V = \{p, q, r\} \subset \{1, \dots, N\}$. Then the state $\Pi_V|\varphi)$ is proportional to a pure state of a three-dimensional system, whose representation $S_{\Pi_V \varphi}$ is the 3 x 3 square sub-matrix of $S_\varphi$ with elements $[S_\varphi]_{kl} = \sqrt{p_k p_l}\, e^{i\theta_{kl}}$, $(k, l) \in V \times V$. Now, corollary 49 forces the relation $e^{i\theta_{pr}} = e^{i(\theta_{pq} + \theta_{qr})}$. Since this relation must hold for every choice of the triple $V = \{p, q, r\}$, if we define $\alpha_p := \theta_{1p}$, then we have $e^{i\theta_{pq}} = e^{i(\theta_{p1} + \theta_{1q})} = e^{i(\theta_{1q} - \theta_{1p})} = e^{i(\alpha_q - \alpha_p)}$. It is then immediate to verify that $S_\varphi = |v\rangle\langle v|$, where $v = (\sqrt{p_1}, \sqrt{p_2}\, e^{-i\alpha_2}, \dots, \sqrt{p_N}\, e^{-i\alpha_N})^T$. ■
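The phase bookkeeping in lemma 76 can be illustrated with a small numerical experiment: starting from probabilities and phases, build $S = |v\rangle\langle v|$ with $v_k = \sqrt{p_k}\, e^{-i\alpha_k}$, then check that the phases of the matrix elements satisfy the triple relation on every triple of indices and that the $\alpha_k$ can be read off from the first row of $S$. The names and sample values are illustrative, and all $p_k$ are taken strictly positive so that every phase is defined.

```python
import cmath
import math
from itertools import permutations

def state_matrix(p, alpha):
    """S = |v><v| with v_k = sqrt(p_k) exp(-i alpha_k)."""
    v = [math.sqrt(pk) * cmath.exp(-1j * ak) for pk, ak in zip(p, alpha)]
    return [[a * b.conjugate() for b in v] for a in v]

def phase_defect(S):
    """Largest |e^{i theta_pr} - e^{i(theta_pq + theta_qr)}| over all ordered triples."""
    d = len(S)
    unit = [[S[m][n] / abs(S[m][n]) for n in range(d)] for m in range(d)]
    return max(abs(unit[p][r] - unit[p][q] * unit[q][r])
               for p, q, r in permutations(range(d), 3))

p = [0.1, 0.2, 0.3, 0.4]
alpha = [0.0, 0.5, 1.7, 4.0]          # alpha_1 = 0 by convention
S = state_matrix(p, alpha)
defect = phase_defect(S)
first_row_phases = [S[0][k] / abs(S[0][k]) for k in range(len(p))]
```

With this convention the first-row entries carry the phases $e^{i\alpha_k}$, so a rank-one matrix of this form is fixed by its diagonal together with the phases of a single row.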



In conclusion, we proved the following

Corollary 50 For every system A, the state space $\mathrm{St}_1(A)$ can be represented as a subset of the set of density matrices in dimension $d_A$.

Proof. For every state $\rho \in \mathrm{St}_1(A)$ the matrix $S_\rho$ is Hermitian by construction, with unit trace by corollary 42, and positive since it is a convex mixture of positive matrices. ■



E. Quantum theory in finite dimensions 

Here we conclude our derivation of quantum theory 
by showing that every density matrix in dimension dA 
corresponds to some state p G Sti(A). 

We already know from the superposition principle
(lemma 16) that for every choice of probabilities $\{p_i\}_{i=1}^{d_{\mathrm{A}}}$
there is a pure state $\varphi \in \mathrm{St}_1(\mathrm{A})$ such that $\{p_i\}_{i=1}^{d_{\mathrm{A}}}$ are
the diagonal elements of $S_\varphi$. Thus, the set of density
matrices corresponding to pure states contains at
least one matrix of the form $S_\varphi = |v\rangle\langle v|$, with
$|v\rangle = (\sqrt{p_1}, \sqrt{p_2}\, e^{-i\beta_2}, \dots, \sqrt{p_{d_{\mathrm{A}}}}\, e^{-i\beta_{d_{\mathrm{A}}}})^T$. It only remains to
prove that every possible choice of phases $\beta_i \in [0,2\pi)$
corresponds to some pure state.

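Recall that for a face $F \subseteq \mathrm{St}_1(\mathrm{A})$ we defined the group
$G_{F,F^\perp}$ to be the group of reversible transformations
$\mathcal{U} \in G_{\mathrm{A}}$ such that $\mathcal{U} =_{\omega_F} \mathcal{I}_{\mathrm{A}}$ and
$\mathcal{U} =_{\omega_F^\perp} \mathcal{I}_{\mathrm{A}}$. We then have
the following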



Lemma 77 Consider a system A with $d_{\mathrm{A}} = N$. Let
$\{\varphi_i\}_{i=1}^{N} \subset \mathrm{St}_1(\mathrm{A})$ be a maximal set of perfectly
distinguishable pure states, $F$ be the face identified by
$\omega_F = \frac{1}{N-1} \sum_{i=1}^{N-1} \varphi_i$ and $F^\perp$ its orthogonal face,
identified by the state $\varphi_N$. If $\mathcal{U}$ is a reversible transformation
in $G_{F,F^\perp}$, then the action of $\mathcal{U}$ is given by

$$S_{\mathcal{U}\rho} = U S_\rho U^\dagger, \qquad U = \begin{pmatrix} I_{N-1} & 0 \\ 0 & e^{-i\beta} \end{pmatrix}, \tag{66}$$

where $I_{N-1}$ is the $(N-1) \times (N-1)$ identity matrix and
$\beta \in [0,2\pi)$.

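Proof. Consider an arbitrary state $\rho \in \mathrm{St}_1(\mathrm{A})$ and its
matrix representation

$$S_\rho = \begin{pmatrix} S_{\Pi_F \rho} & f \\ f^\dagger & \rho_{NN} \end{pmatrix},$$

where $f \in \mathbb{C}^{N-1}$ is a suitable vector. Since $\mathcal{U} =_{\omega_F} \mathcal{I}_{\mathrm{A}}$
and $\mathcal{U} =_{\omega_F^\perp} \mathcal{I}_{\mathrm{A}}$, we have that

$$S_{\mathcal{U}\rho} = \begin{pmatrix} S_{\Pi_F \rho} & g \\ g^\dagger & \rho_{NN} \end{pmatrix},$$

where $g \in \mathbb{C}^{N-1}$ is a suitable vector. To prove Eq. (66),
we will now prove that $g = e^{i\beta} f$ for some suitable
$\beta \in [0,2\pi)$.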

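Let us start from the case $N = 3$. Since $\mathcal{U}|\varphi_i) = |\varphi_i)$
$\forall i = 1,2,3$, we have $(a_i|\mathcal{U} = (a_i|$ $\forall i = 1,2,3$
(lemma [H]). This implies that $\mathcal{U}$ sends states in the face
$F_{13}$ to states in the face $F_{13}$: indeed, for every $\rho \in F_{13}$ one
has $(a_{13}|\mathcal{U}|\rho) = (a_{13}|\rho) = 1$, which implies $\mathcal{U}\rho \in F_{13}$
(lemma 46). In other words, the restriction of $\mathcal{U}$ to the
face $F_{13}$ is a reversible qubit transformation. Therefore,
the action of $\mathcal{U}$ on a state $\rho \in F_{13}$ must be given by

$$S_{\mathcal{U}\rho} = \begin{pmatrix} \rho_{11} & \rho_{13}\, e^{i\beta} \\ \rho_{31}\, e^{-i\beta} & \rho_{33} \end{pmatrix},$$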

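for some $\beta \in [0,2\pi)$. Similarly, we can see that $\mathcal{U}$ sends
states in the face $F_{23}$ to states in the face $F_{23}$. Hence,
for every $\sigma \in F_{23}$ we have

$$S_{\mathcal{U}\sigma} = \begin{pmatrix} \sigma_{22} & \sigma_{23}\, e^{i\beta'} \\ \sigma_{32}\, e^{-i\beta'} & \sigma_{33} \end{pmatrix},$$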

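for some $\beta' \in [0,2\pi)$. We now show that $e^{i\beta'} = e^{i\beta}$. To
see that, consider a generic state $\varphi \in \mathrm{St}_1(\mathrm{A})$, with the
property that $p_i = (a_i|\varphi) > 0$ for every $i = 1,2,3$ (such a
state exists due to the superposition principle of theorem
16). Writing $S_\varphi$ as in Eq. (35) we then have

$$S_{\mathcal{U}\varphi} = \begin{pmatrix} p_1 & \sqrt{p_1 p_2}\, e^{i\theta_{12}} & \sqrt{p_1 p_3}\, e^{i(\theta_{13}+\beta)} \\ \sqrt{p_1 p_2}\, e^{-i\theta_{12}} & p_2 & \sqrt{p_2 p_3}\, e^{i(\theta_{23}+\beta')} \\ \sqrt{p_1 p_3}\, e^{-i(\theta_{13}+\beta)} & \sqrt{p_2 p_3}\, e^{-i(\theta_{23}+\beta')} & p_3 \end{pmatrix}.$$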

Now, since $\varphi$ and $\mathcal{U}\varphi$ are pure states, by corollary 49 we
must have

$$e^{i\theta_{13}} = e^{i(\theta_{12}+\theta_{23})}, \qquad e^{i(\theta_{13}+\beta)} = e^{i(\theta_{12}+\theta_{23}+\beta')}.$$

By comparison we obtain $e^{i\beta} = e^{i\beta'}$. This proves Eq.
(66) for $N = 3$. The proof for $N > 3$ is then immediate:
for every three-dimensional face $F_{pqN}$ the action of $\mathcal{U}$ is
given by Eq. (66) for some $\beta_{pq}$. However, since the two faces
$F_{pqN}$ and $F_{pq'N}$ overlap on the face $F_{pN}$, we must have $\beta_{pq} = \beta_{pq'}$.
Similarly $\beta_{pq} = \beta_{p'q}$. We conclude that $\beta_{pq} = \beta$ for every
$p, q$. This proves Eq. (66) in the general case. ■
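The block structure used in this proof can be illustrated numerically. The sketch below assumes the convention $U = \mathrm{diag}(1, \dots, 1, e^{-i\beta})$, under which the off-diagonal block satisfies $g = e^{i\beta} f$. With the opposite sign in $U$ the phase of $g$ flips, which is immaterial because $\beta$ ranges over all of $[0,2\pi)$. The function names are hypothetical.

```python
import cmath


def phase_shift_on_last_level(n, beta):
    """Diagonal entries of U = diag(1, ..., 1, exp(-i * beta)) (assumed convention)."""
    return [1 + 0j] * (n - 1) + [cmath.exp(-1j * beta)]


def conjugate_by_diagonal(s, diag):
    """Return U S U^dagger for the diagonal unitary U = diag(diag)."""
    n = len(s)
    return [[diag[k] * s[k][l] * diag[l].conjugate() for l in range(n)]
            for k in range(n)]
```

Conjugating any Hermitian matrix this way leaves the upper-left $(N-1) \times (N-1)$ block and the entry $\rho_{NN}$ untouched, and multiplies the last column by the single phase $e^{i\beta}$.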

We now show that every possible phase shift in Eq.
(66) corresponds to a physical transformation:

Lemma 78 A transformation $\mathcal{U}$ of the form of Eq. (66)
is a reversible transformation for every $\beta \in [0,2\pi)$.

Proof. By lemma 77, the group $G_{F,F^\perp}$ is a subgroup
of $U(1)$. Now, there are two possibilities: either $G_{F,F^\perp}$
is a (finite) cyclic group or $G_{F,F^\perp}$ coincides with $U(1)$.
However, we know from theorem 15 that $G_{F,F^\perp}$ has a
continuum of elements. Hence, $G_{F,F^\perp} \simeq U(1)$ and $\beta$ can
take every value in $[0,2\pi)$. ■

An obvious corollary of the previous lemmas is the fol- 
lowing 

Corollary 51 The transformation $\mathcal{U}_\beta$ defined by

$$S_{\mathcal{U}_\beta \rho} = U S_\rho U^\dagger, \tag{67}$$

where $U$ is the diagonal matrix with diagonal elements
$(1, e^{-i\beta_2}, \dots, e^{-i\beta_N})$, is a reversible transformation for every
vector $\beta := (\beta_2, \dots, \beta_N) \in [0,2\pi) \times \cdots \times [0,2\pi)$.

This leads directly to the conclusion of our derivation: 

Theorem 20 For every system A, the state space $\mathrm{St}_1(\mathrm{A})$
is the set of all density matrices on the Hilbert space $\mathbb{C}^{d_{\mathrm{A}}}$.

Proof. Let $N = d_{\mathrm{A}}$. For every choice of probabilities
$p = (p_1, \dots, p_N)$ there exists at least one pure state $\varphi_p$
such that $p_k = (a_k|\varphi_p)$ for every $k = 1, \dots, N$ (lemma
16). This state is represented by the matrix
$S_{\varphi_p} = |v_p\rangle\langle v_p|$ with
$|v_p\rangle = (\sqrt{p_1}, \sqrt{p_2}\, e^{-i\alpha_2}, \dots, \sqrt{p_N}\, e^{-i\alpha_N})^T$
(lemma 76). Finally, we can transform $\varphi_p$ with every
reversible transformation $\mathcal{U}_\beta$ defined in Eq. (67),
thus obtaining $S_{\mathcal{U}_\beta \varphi_p} = U_\beta |v_p\rangle\langle v_p| U_\beta^\dagger$, where
$U_\beta |v_p\rangle = (\sqrt{p_1}, \sqrt{p_2}\, e^{-i(\alpha_2+\beta_2)}, \dots, \sqrt{p_N}\, e^{-i(\alpha_N+\beta_N)})^T$. Since $p$ and
$\beta$ are arbitrary, this means that every rank-one density
matrix corresponds to some pure state. Taking the possible
convex mixtures we obtain that every $N \times N$ density
matrix corresponds to some state of system A. ■
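The counting argument in this proof can be made concrete: given any target vector in $\mathbb{C}^N$, one can read off the probabilities $p$ and the phase shifts $\beta$ that reach it from the reference vector $|v_p\rangle$. The sketch below is only an illustration. It assumes $U_\beta = \mathrm{diag}(1, e^{-i\beta_2}, \dots, e^{-i\beta_N})$ and $\alpha_1 = 0$, and all function names are hypothetical.

```python
import cmath
import math


def reference_vector(p, alpha):
    """|v_p> with components sqrt(p_k) * exp(-i * alpha_k); alpha[0] should be 0."""
    return [math.sqrt(pk) * cmath.exp(-1j * ak) for pk, ak in zip(p, alpha)]


def apply_phase_shifts(v, beta):
    """Apply the diagonal unitary with entries exp(-i * beta_k); beta[0] is 0."""
    return [cmath.exp(-1j * bk) * vk for bk, vk in zip(beta, v)]


def projector(v):
    """Rank-one matrix |v><v| as a list of rows."""
    return [[a * b.conjugate() for b in v] for a in v]


def probabilities_and_shifts(target, alpha):
    """Return (p, beta) such that the shifted reference vector has the
    same projector as the normalized `target` vector."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in target))
    w = [c / norm for c in target]
    # Remove the global phase so that the first component is real and non-negative.
    g = cmath.exp(-1j * cmath.phase(w[0]))
    w = [c * g for c in w]
    p = [abs(c) ** 2 for c in w]
    two_pi = 2 * math.pi
    # w_k = sqrt(p_k) exp(-i (alpha_k + beta_k))  =>  beta_k = -arg(w_k) - alpha_k mod 2 pi
    beta = [(-cmath.phase(c) - a) % two_pi for c, a in zip(w, alpha)]
    beta[0] = 0.0  # the first diagonal entry of the unitary is fixed to 1
    return p, beta
```

A general density matrix is then obtained as a convex mixture of such rank-one matrices, for example through its spectral decomposition.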

Choosing a suitable representation $\rho \mapsto S_\rho$, we proved
that for every system A the set of normalized states
$\mathrm{St}_1(\mathrm{A})$ is the whole set of density matrices in dimension
$d_{\mathrm{A}}$. Thanks to the purification postulate, this is enough
to prove that all the effects $\mathrm{Eff}(\mathrm{A})$ and all the
transformations $\mathrm{Transf}(\mathrm{A},\mathrm{B})$ allowed in our theory are exactly
the effects and the transformations allowed in quantum
theory. Precisely, we have the following

Corollary 52 For every pair of systems A and B the
set of physical transformations $\mathrm{Transf}(\mathrm{A},\mathrm{B})$ coincides
with the set of all completely positive trace-non-increasing
maps from $M_{d_{\mathrm{A}}}(\mathbb{C})$ to $M_{d_{\mathrm{B}}}(\mathbb{C})$.



Proof. We proved that our theory has the same normalized
states as quantum theory. On the other hand, quantum
theory is a theory with purification and in quantum
theory the possible physical transformations are quantum
operations, i.e. completely positive trace-non-increasing
maps. The thesis then follows from the fact that two
theories with purification that have the same set of nor- 
malized states are necessarily the same (theorem |3|). ■ 
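As an illustration of what a completely positive trace-non-increasing map looks like in practice, the sketch below uses the standard Kraus form $\rho \mapsto \sum_i K_i \rho K_i^\dagger$ with $\sum_i K_i^\dagger K_i \le I$. The amplitude-damping operators are a textbook example chosen here for illustration and are not taken from the text; all function names are hypothetical.

```python
import math


def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply_kraus(rho, kraus):
    """Return sum_i K_i rho K_i^dagger for the given list of Kraus operators."""
    n = len(kraus[0])
    out = [[0j] * n for _ in range(n)]
    for k in kraus:
        term = matmul(matmul(k, rho), dagger(k))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def damping_kraus(gamma):
    """Kraus operators of qubit amplitude damping with decay probability gamma."""
    k0 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]]  # "no decay" branch
    k1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]        # "decay" branch
    return k0, k1
```

Keeping only the first Kraus operator gives a map that can lower the trace (a single outcome of a measurement process), while keeping both gives a trace-preserving channel.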



XIV. CONCLUSION 

Quantum theory can be derived from purely informa- 
tional principles. In particular, it belongs to a broad class 
of theories of information-processing that includes clas- 
sical and quantum information theory as special cases. 
Within this class, quantum theory is identified uniquely 
by the purification postulate, stating that the ignorance 
about a part is always compatible with the maximal 
knowledge of the whole in an essentially unique way. 
This postulate appears as the origin of the key features 
of quantum information processing, such as no-cloning, 
teleportation, and error correction (see also Ref. [^lf ). 
The general vision underlying the present work is that 
the main primitives of quantum information processing 
should be derived directly from the principles, without 
the abstract mathematics of Hilbert spaces, in order to
make the revolutionary aspects of quantum information 
immediately accessible and to place them in the broader 
context of the fundamental laws of physics. 

Finally, we would like to comment on possible general- 
izations of our work. As in any axiomatic construction, 
one can ask how the results change when the principles 
are modified. For example, one may be interested in re- 
laxing the local distinguishability axiom and in consider- 
ing theories, like quantum theory on real Hilbert spaces, 
where global measurements are essential to characterize 
the state of a composite system. In this direction, the results
of Ref. suggest that quantum theory on real Hilbert spaces
can also be derived from the purification principle, once
the local distinguishability requirement has been suitably
relaxed. A possible way to weaken the local distinguishability
requirement is to assume only the property of local
distinguishability from pure states proposed in Ref.: this
property states that the probability of distinguishing two
states by local measurements is larger than 1/2 whenever
one of the two states is pure.
A different way to relax local distinguishability would be 
to assume the property of 2-local tomography proposed 
in Ref. p9| , which requires that the state of a multipar- 
tite system can be completely characterized using only 
measurements on bipartite subsystems. This property 
is equivalent to 2-local distinguishability, defined as the 
requirement that two different states of a multipartite 
system can be distinguished with probability of success 
larger than 1/2 using only local measurements or mea- 
surements on bipartite subsystems. 

A more radical generalization of our work would be to 
relax the assumption of causality. This would be partic- 
ularly important for the discussion of quantum gravity 
scenarios, where the causal structure is not given a pri- 
ori but is part of the dynamical variables of the theory. 
In this respect, the contribution of our work is twofold. 
First, it makes evident how fundamental the assumption
of causality is in the ordinary formulation of quantum
theory: the whole formalism of quantum states as
density matrices with unit trace, quantum measurements 
as resolutions of the identity, and quantum channels as 
trace-preserving maps is crucially based on it. Techni- 
cally speaking, the fact that the normalization of a state 
is given by a single linear functional (the trace, in quan- 
tum theory) is the signature of causality. This partly 
explains the troubles and paradoxes encountered when 
trying to combine the formalism of density matrices with 
non-causal evolutions, as in Deutsch's model for closed
timelike curves gijffi]]. Moreover, given that the usual 
notion of normalization has to be abandoned in the non- 
causal scenario, and that the ordinary quantum formal- 
ism becomes inadequate, one may ask in what sense a 
theory of quantum gravity would be "quantum". The
suggestion coming from our work is that a "quantum" 
theory is a theory satisfying the purification principle, 
which can be suitably formulated even in the absence of 
causality [^2|. The discussion of theories with purifica- 
tion in the non-causal scenario is an exciting avenue of 
future research. 